#data-science-and-ml | Python | Page 23

warped gate Oct 17, 2022, 4:44 PM

#

Fellow ChODS member here btw 😄

fresh fable Oct 17, 2022, 4:44 PM

#

ah nice haha

#

im just messing around i learnt this huckel theory approximation quite some time ago but never really understood how they computed the energies for benzene or other larger molecules

#

3x3 or 4x4 matrices were quite easy to examine but the larger ones like benzene made me curious as to how they're computed

split drift Oct 17, 2022, 4:46 PM

#

Are some computations faster in pyarrow than in numpy?

warped gate Oct 17, 2022, 4:48 PM

#

fresh fable im just messing around i learnt this huckel theory approximation quite some time...

I see

warped gate Oct 17, 2022, 4:48 PM

#

fresh fable 3x3 or 4x4 matrices were quite easy to examine but the larger ones like benzene ...

That shouldn't be very difficult imo

#

ig for directly putting the values and calculating for matrices with larger dimensions you could use numpy.linalg.det(array)

fresh fable Oct 17, 2022, 4:52 PM

#

but i have variables in the matrix (x)

#

like this

warped gate Oct 17, 2022, 4:54 PM

#

fresh fable but i have variables in the matrix (x)

Hmm, maybe just use an arbitrary real number for x?

fresh fable Oct 17, 2022, 4:55 PM

#

how do you mean?

warped gate Oct 17, 2022, 4:55 PM

#

what is the determinant finding here?

fresh fable Oct 17, 2022, 4:56 PM

#

the determinant gives an equation in x which is set equal to 0

#

in this case it's a quartic equation

#

the roots of this equation are the coefficients i'm looking for

warped gate Oct 17, 2022, 4:58 PM

#

Ahh got it.

#

https://stackoverflow.com/questions/55583653/how-can-you-find-the-determinant-of-the-following-matrix-in-python-a-np-array

#

Take a look at this maybe

fresh fable Oct 17, 2022, 5:18 PM

#

Ah I see, thank you!
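
For larger molecules the symbolic determinant can be avoided altogether. A minimal sketch, assuming the usual Hückel secular determinant (x on the diagonal, 1 between bonded atoms, with x = (α − E)/β): setting det(xI + A) = 0 means the roots x are the negated eigenvalues of the adjacency matrix A. The matrices and the helper name `huckel_x_roots` below are illustrative, not taken from the conversation.

```python
import numpy as np

def huckel_x_roots(adjacency):
    """Roots x of det(x*I + A) = 0, i.e. the negated eigenvalues of A."""
    # eigvalsh is meant for symmetric matrices such as adjacency matrices
    return np.sort(-np.linalg.eigvalsh(adjacency))

# Butadiene: a chain of 4 carbon atoms (gives the quartic case)
butadiene = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Benzene: a ring of 6 carbon atoms
benzene = np.zeros((6, 6))
for i in range(6):
    j = (i + 1) % 6
    benzene[i, j] = benzene[j, i] = 1.0

# Coefficients of det(x*I + A) as a polynomial in x, highest power first
quartic = np.poly(-butadiene)

print(quartic)                    # coefficients of the quartic in x
print(huckel_x_roots(butadiene))  # its four roots
print(huckel_x_roots(benzene))    # the six roots for benzene
```

Under the same assumption, each root maps back to an orbital energy via E = α − xβ.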

heady spoke Oct 17, 2022, 5:33 PM

#

Hello, my name is Agustin. I am from Argentina. I am currently working in data analytics. I am trying to solve a problem with pandas, specifically with the method .astype(), which is being deprecated and will stop working in the near future. I don't know how to replace it. The warning itself suggests this replacement: Use obj.tz_localize(None) or obj.tz_convert('UTC').tz_localize(None) instead

#

Can someone help me with this? I dont understand how to replace this function with those options

#

that is the error that I get when executing the astype() method. It's more a warning than an error, but in the near future it will become an error
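
A minimal sketch of the two replacements named in the warning, assuming the column holds timezone-aware datetimes and the deprecated call was something like `.astype('datetime64[ns]')` (the actual code is not shown in the chat). On a Series, the `obj` methods from the warning are reached through the `.dt` accessor. The sample values are made up.

```python
import pandas as pd

# Hypothetical tz-aware column (UTC-03:00), standing in for the real data
s = pd.Series(pd.to_datetime([
    "2022-10-17 17:33:00-03:00",
    "2022-10-17 18:15:00-03:00",
]))

# Option 1: drop the timezone but keep the local wall-clock time
naive_local = s.dt.tz_localize(None)

# Option 2: convert to UTC first, then drop the timezone
naive_utc = s.dt.tz_convert("UTC").dt.tz_localize(None)

print(naive_local)
print(naive_utc)
```

Which option is right depends on whether the downstream code expects local times or UTC times.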

lone nacelle Oct 17, 2022, 6:15 PM

#

Hello, I have a question about numpy, specifically about numpy.linalg.eig. In the documentation, it says that it returns “normalized” eigenvectors. However, I don’t want them normalized for a project I’m working on. I’ve looked at stackoverflow, but there’s no suggestion that doesn’t involve going to another package sympy. Is there any way to use just numpy and calculate the eigenvectors of a matrix without the normalizing?
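
An eigenvector is only defined up to a scalar factor, so there is no single "unnormalized" form; any rescaling of a column returned by `numpy.linalg.eig` is still an eigenvector. A minimal sketch, assuming the goal is a different scaling convention (here: the largest-magnitude component of each vector becomes 1, which is only one possible choice). The matrix is made up.

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # columns have unit length

# For each column, pick its largest-magnitude entry and divide by it
pivot_rows = np.argmax(np.abs(eigenvectors), axis=0)
pivots = eigenvectors[pivot_rows, np.arange(eigenvectors.shape[1])]
rescaled = eigenvectors / pivots

# Each rescaled column still satisfies A v = lambda v
print(rescaled)
print(np.allclose(A @ rescaled, rescaled * eigenvalues))
```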

cursive pond Oct 17, 2022, 6:24 PM

#

Hello, general question about machine learning from a beginner here: what does it mean to train a model? I often see that being said in context of machine learning, but based on my experience with kNN and GAs, I dont really get what it means. How would I "train" my GA or kNN algorithm? Or is training needed for other algorithms and methods besides those simple ones?

Also, how exactly can I imagine machine "learning"? How is my machine "learning" something by using kNN or GAs? I only see it as following a strict pattern/algorithm and coming to a solution that way?

I hope someone can answer my questions, thanks in advance!

young granite Oct 17, 2022, 6:36 PM

#

cursive pond Hello, general question about machine learning from a beginner here: what does i...

so from my point of view (also a beginner), training consists in adjusting the parameters of your network. This is also the reason why you usually divide your data into training and validation data.

#

If you look at this model you will notice a certain peculiarity for areas of your model that are not covered by data.

if you would then analyze known data with it and check the predictions in a truth matrix you could make statements about the quality of your model

cursive pond Oct 17, 2022, 6:40 PM

#

young granite so from my point of view (also a beginner), training consists in adjusting the p...

Okay, so training is not to be understood as the model training itself, but just us humans adjusting parameters so its giving better results?

young granite Oct 17, 2022, 6:40 PM

#

cursive pond Okay, so training is not to be understood as the model training itself, but just...

depends on the training method

cursive pond Oct 17, 2022, 6:40 PM

#

For example genetic algorithms or the k-nearest neighbors algorithm?

young granite Oct 17, 2022, 6:40 PM

#

u got 3 main methods: supervised, unsupervised and reinforcement learning

young granite Oct 17, 2022, 6:41 PM

#

cursive pond For example genetic algorithms or the k-nearest neighbors algorithm?

those are algorithm types

#

the method is how ur model handles ur data

cursive pond Oct 17, 2022, 6:42 PM

#

Can you give an example for that?

young granite Oct 17, 2022, 6:42 PM

#

different algorithms result in different predictions for same dataset

#

Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbour and Naive Bayes are the main ones

#

and its to say that for all the above the input data differs so u have to normalise the inputs but from the original dataset

#

so the dataset always is the same

#

if that makes any sense 🗿

#

maybe @serene scaffold can crosscheck my explanation

cursive pond Oct 17, 2022, 6:49 PM

#

cursive pond Hello, general question about machine learning from a beginner here: what does i...

Thanks for your explanations! Can you also answer my 2nd question there, about how I can imagine this learning process?

young granite Oct 17, 2022, 6:49 PM

#

cursive pond Thanks for your explanations! Can you also answer my 2nd question there, about h...

as stated above it depends on the learning model chosen

young granite Oct 17, 2022, 6:50 PM

#

young granite u got 3 main methods: supervised, unsupervised and reinforcement learning

^dis

cursive pond Oct 17, 2022, 6:50 PM

#

Lets say im doing supervised kNN?

#

Or, as another example, unsupervised GA?

young granite Oct 17, 2022, 6:51 PM

#

basically its linear regression and u got functions in ur NN with y=f(x)

cursive pond Oct 17, 2022, 6:52 PM

#

NN stands for...?

young granite Oct 17, 2022, 6:52 PM

#

and u got labelled data with which ur NN can check autonomously if it was right or wrong

#

neural network

cursive pond Oct 17, 2022, 6:53 PM

#

Oh, im not using NNs

young granite Oct 17, 2022, 6:53 PM

#

u always do

cursive pond Oct 17, 2022, 6:54 PM

#

The only thing I did so far is really just implementing the k-nearest neighbor algorithm to predict data, and an evolutionary algorithm

#

Is that not machine learning, if im just using those algorithms without a NN?

young granite Oct 17, 2022, 6:56 PM

#

ok so u just use statistics?

cursive pond Oct 17, 2022, 6:57 PM

#

Hmm yeah I guess

#

I basically got these algorithms from a machine learning tutorial series, so I thought this would already be some sort of learning, but ig its just statistics then? Where/when does the actual learning part come in? With NNs?

strong sedge Oct 17, 2022, 6:59 PM

#

is this a good explanation of gradient descent?
https://github.com/sivansh11/machine-learning-explained/blob/main/gradient_decent.ipynb


young granite Oct 17, 2022, 7:01 PM

#

cursive pond I basically got these algorithms from a machine learning tutorial series, so I t...

if u use tensorflow or pytorch 🗿

#

and for my understanding of whats happening inside the hidden layers of a NN -> Black Box cause the Network finds correlations and causalities from n-data

cursive pond Oct 17, 2022, 7:05 PM

#

Ok so, how is the NN interacting with, for example, the kNN algorithm then on a higher level? Who is influencing/adjusting what?

young granite Oct 17, 2022, 7:06 PM

#

u got input(blue) hidden(yellow,red) and output(green)
all got a function y=f(x) and weights(black strings) now it gives different approaches but easiest is forward so blue->yellow->red->green

#

"what wires together fires together"

#

depending on the value for a given neuron it fires or it wont

#

and thats the learning part

strong sedge Oct 17, 2022, 7:09 PM

#

fyi knn and nn are 2 fundamentally different algorithms
dont get confused

cursive pond Oct 17, 2022, 7:09 PM

#

I got a general idea of NNs, but how do those inputs and outputs now interact with an algorithm like kNN? Where would I put that in a NN?

strong sedge Oct 17, 2022, 7:09 PM

#

knn stands for k nearest neighbours and nn stands for neural networks

serene scaffold Oct 17, 2022, 7:10 PM

#

cursive pond I got a general idea of NNs, but how do those inputs and outputs now interact wi...

NN and kNN are unrelated.

cursive pond Oct 17, 2022, 7:10 PM

#

serene scaffold NN and kNN are unrelated.

Well u can use them tgt tho, cant u?

strong sedge Oct 17, 2022, 7:10 PM

#

strong sedge knn stands for k nearest neighbours and nn stands for neural networks

they do different things internally

cursive pond Oct 17, 2022, 7:10 PM

#

And my question is, how are you using them together?

serene scaffold Oct 17, 2022, 7:10 PM

#

cursive pond Well u can use them tgt tho, cant u?

I guess?

young granite Oct 17, 2022, 7:10 PM

#

cursive pond And my question is, how are you using them together?

classification

#

first then learning

#

statistics always first step

strong sedge Oct 17, 2022, 7:11 PM

#

cursive pond And my question is, how are you using them together?

u dont really use them together

#

they are 2 different things to do something similar

#

like you can ride a car, or a bike

#

both do the same thing

#

but both are fundamentally different

cursive pond Oct 17, 2022, 7:11 PM

#

Alright, is it the same for GAs and NNs? Or is that possible to be used tgt?

strong sedge Oct 17, 2022, 7:12 PM

#

cursive pond Alright, is it the same for GAs and NNs? Or is that possible to be used tgt?

what is tgt 😅

cursive pond Oct 17, 2022, 7:12 PM

#

I saw some of those videos with GAs and NNs on youtube

#

Together

young granite Oct 17, 2022, 7:12 PM

#

@strong sedge i could give statistic functions as neuron functions tho cant i?

strong sedge Oct 17, 2022, 7:12 PM

#

cursive pond Alright, is it the same for GAs and NNs? Or is that possible to be used tgt?

yes, you can use a GA and NN together

strong sedge Oct 17, 2022, 7:13 PM

#

young granite @strong sedge i could give statistic functions as neuron functions tho c...

can you elaborate ?

cursive pond Oct 17, 2022, 7:13 PM

#

strong sedge yes, you can use a GA and NN together

And why not kNNs? Because GAs are just making more sense to be combined with NNs?

young granite Oct 17, 2022, 7:13 PM

#

for my understanding and like i tried to explain to bruce neurons got functions and weights applied to em (y=f(x))

cursive pond Oct 17, 2022, 7:14 PM

#

Maybe because the kNN wouldnt make sense cuz its a supervised algorithm and there wouldnt be much left to learn if ur just comparing the data etc.?

young granite Oct 17, 2022, 7:14 PM

#

so it would be possible to say make a predicition with knn and use the value of it

strong sedge Oct 17, 2022, 7:14 PM

#

cursive pond And why not kNNs? Because GAs are just making more sense to be combined with NNs...

ummm, I would honestly suggest that you understand what neural networks are, k nearest neighbour is and how genetic algorithms work at a deeper level

strong sedge Oct 17, 2022, 7:15 PM

#

young granite so it would be possible to say make a predicition with knn and use the value of ...

no no no,

#

knn != nn

#

they are different

young granite Oct 17, 2022, 7:15 PM

#

ofc

strong sedge Oct 17, 2022, 7:15 PM

#

knn doesnt have a neuron in it

young granite Oct 17, 2022, 7:15 PM

#

i do know that

#

but neurons got functions

#

so i assumed i could apply a function to the neuron like knn

#

different approaches yes

#

and nn is not always >> statistics but i thought i could "combine" if wanted

strong sedge Oct 17, 2022, 7:17 PM

#

no

#

that is not how nn works

young granite Oct 17, 2022, 7:18 PM

#

elaborate pls

strong sedge Oct 17, 2022, 7:20 PM

#

nn doesnt work on statistics, rather calculus
knn also doesnt work on statistics, it works on the distance between points

there is a separate algorithm called naive bayes that works on a statistical idea called bayes' theorem

strong sedge Oct 17, 2022, 7:20 PM

#

young granite elaborate pls

a neuron in a neural network takes in inputs, does some processing on it and gives some outputs
in function form
y = f(x)

young granite Oct 17, 2022, 7:21 PM

#

yes

strong sedge Oct 17, 2022, 7:22 PM

#

knn works on the distance between points: the resultant value of a new point is the average of its k nearest points (for classification, the majority label among them)

#

2 very different ideas
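
The kNN description above can be sketched in a few lines. This is a minimal illustration, not a full implementation: the function name `knn_predict` and the toy data are made up. Note that there is no parameter fitting step; "training" kNN amounts to storing the labelled points.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    # Euclidean distance from the query to every stored training point
    distances = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    return train_y[nearest].mean()

train_x = np.array([[0.0], [1.0], [2.0], [10.0]])
train_y = np.array([0.0, 1.0, 2.0, 10.0])

prediction = knn_predict(train_x, train_y, np.array([1.2]), k=3)
print(prediction)
```

A neural network, by contrast, has weights that are adjusted during training, which is why the two are not interchangeable parts.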

young granite Oct 17, 2022, 7:23 PM

#

i tried to keep it simple and not explain the underlying idea but i thought i could combine em thanks for correcting me

strong sedge Oct 17, 2022, 7:23 PM

#

no worries

young granite Oct 17, 2022, 7:23 PM

#

but in a nn i work somewhat with statistics cause the system searches for correlations and causalities doesnt it?

strong sedge Oct 17, 2022, 7:24 PM

#

strong sedge is this a good explanation of gradient decent ? https://github.com/sivansh11/mac...

do give feed back on this, tell me what changes I should make

strong sedge Oct 17, 2022, 7:24 PM

#

young granite but in a nn i work somewhat with statistics cause the system searches for correl...

idrk

young granite Oct 17, 2022, 7:25 PM

#

best answer BLACK BOX 🗿

young granite Oct 17, 2022, 7:25 PM

#

young granite but in a nn i work somewhat with statistics cause the system searches for correl...

@serene scaffold u got any clue on that?

young granite Oct 17, 2022, 7:28 PM

#

strong sedge do give feed back on this, tell me what changes I should make

give me a few mins

#

"note: the multiplier used should be small, there is no fixed value, can **you **what ever works for you"

#

looks fine for me and is sufficient but i dont quite remember what my prof said a few years back

strong sedge Oct 17, 2022, 7:36 PM

#

young granite "note: the multiplier used should be small, there is no fixed value, can **you *...

ill fix it

#

thanks

young granite Oct 17, 2022, 7:37 PM

#

gladly

lapis sequoia Oct 17, 2022, 9:09 PM

#

if I have an array of zeroes np.zeros((100, 336)) how do I update the 318th to 325th entries to 1?

serene scaffold Oct 17, 2022, 9:20 PM

#

lapis sequoia if I have an array of zeroes np.zeros(100,336) how do I update the 318th to 325t...

by "entries" do you mean columns?

lapis sequoia Oct 17, 2022, 9:21 PM

#

yes

#

in the first row for example

serene scaffold Oct 17, 2022, 9:22 PM

#

lapis sequoia if I have an array of zeroes np.zeros(100,336) how do I update the 318th to 325t...

arr = np.zeros((100, 336))
arr[:, 318:326] = 1
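
The slice above writes to every row, since `:` selects all rows. A small sketch of the single-row variant, assuming "318th to 325th" refers to zero-based column indices 318 through 325 (the end of a slice is exclusive, hence 326):

```python
import numpy as np

arr = np.zeros((100, 336))

# Only the first row: row index 0, columns 318..325 inclusive
arr[0, 318:326] = 1

print(arr[0, 316:328])  # the eight ones with zeros on either side
print(arr.sum())        # total number of entries set to 1
```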

lapis sequoia Oct 17, 2022, 9:22 PM

#

thanks

#

can i ask a more complicated question?

#

i actually have an array of zeros np.zeros((100, 48*7))

#

each row is a factory shift in a data frame

#

and each column is a half hour segment of the week

serene scaffold Oct 17, 2022, 9:23 PM

#

well fuck

lapis sequoia Oct 17, 2022, 9:23 PM

#

i'm trying to iterate over a dataframe

#

which will count how many factory shifts are active at each half hour of the week

#

so something like

#

for _, r in df.iterrows():

serene scaffold Oct 17, 2022, 9:24 PM

#

when you're doing numpy or pandas, just banish "iterate" from your mind.

lapis sequoia Oct 17, 2022, 9:25 PM

#

ok

#

so what I have so far is

serene scaffold Oct 17, 2022, 9:25 PM

#

hold that thought

#

do print(df.head().to_dict('list')) and put the result in the chat

#

and then explain what you're trying to do, without any code.

#

I won't look at any screenshots.

lapis sequoia Oct 17, 2022, 9:26 PM

#

ok give me one sec

#

thanks

#

crap I dont have access to the file on my home computer

#

guess we can't do it?

serene scaffold Oct 17, 2022, 9:27 PM

#

do you have the name of each column and its dtype memorized?

lapis sequoia Oct 17, 2022, 9:27 PM

#

yes

serene scaffold Oct 17, 2022, 9:28 PM

#

that's what I need to know

lapis sequoia Oct 17, 2022, 9:28 PM

#

factory_position: str

#

day: int

#

shift_start_time: int

#

shift_end_time: int

serene scaffold Oct 17, 2022, 9:29 PM

#

what time unit is shift_start_time? seconds?

lapis sequoia Oct 17, 2022, 9:29 PM

#

its a 24 hour clock

#

so 2300 hours for example

serene scaffold Oct 17, 2022, 9:29 PM

#

that doesn't answer my question.

#

oh, I see.

lapis sequoia Oct 17, 2022, 9:30 PM

#

how should we proceed? should I tell you what I tried to do?

serene scaffold Oct 17, 2022, 9:31 PM

#

so you need to know how many half-hour blocks in each day (00:00, 00:30, 01:00, etc.) are covered by a factory position?

lapis sequoia Oct 17, 2022, 9:31 PM

#

i need to know how many factory workers/positions are needed at each half hour of the week

serene scaffold Oct 17, 2022, 9:32 PM

#

so you need to know how many factory positions are active during each half-hour block?

lapis sequoia Oct 17, 2022, 9:32 PM

#

eactly

#

*exactly

serene scaffold Oct 17, 2022, 9:32 PM

#

okay, we can work with that.

lapis sequoia Oct 17, 2022, 9:33 PM

#

thanks

#

so we loop through the df right and need to figure out the start half hour and end half hour right?

serene scaffold Oct 17, 2022, 9:34 PM

#

no looping.

lapis sequoia Oct 17, 2022, 9:34 PM

#

ok

#

so whats the approach?

serene scaffold Oct 17, 2022, 9:38 PM

#

the first step is to represent everything as actual timestamps

In [3]: pd.Series([430, 1200, 0000])
Out[3]:
0     430
1    1200
2       0
dtype: int64

In [4]: s = _

In [6]: s.astype(str).str.zfill(4)
Out[6]:
0    0430
1    1200
2    0000
dtype: object

In [7]: pd.to_datetime(s.astype(str).str.zfill(4), format='%H%M')
Out[7]:
0   1900-01-01 04:30:00
1   1900-01-01 12:00:00
2   1900-01-01 00:00:00
dtype: datetime64[ns]

lapis sequoia Oct 17, 2022, 9:39 PM

#

ok got it

serene scaffold Oct 17, 2022, 9:40 PM

#

You can also add the days.

In [9]: pd.Series([1, 2, 3]).astype('timedelta64[D]')
Out[9]:
0   1 days
1   2 days
2   3 days
dtype: timedelta64[ns]

In [10]: pd.to_datetime(s.astype(str).str.zfill(4), format='%H%M') + _
Out[10]:
0   1900-01-02 04:30:00
1   1900-01-03 12:00:00
2   1900-01-04 00:00:00
dtype: datetime64[ns]
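
The two steps can be combined on the DataFrame columns directly. A minimal sketch, assuming the column names given earlier in the chat (`day`, `shift_start_time`) and made-up values. It uses `pd.to_timedelta(..., unit="D")` for the day offset, since casting integers with `astype('timedelta64[D]')` may not be accepted by newer pandas versions.

```python
import pandas as pd

# Made-up rows using the column names from the conversation
df = pd.DataFrame({
    "day": [1, 2, 3],
    "shift_start_time": [430, 1200, 0],
})

# 430 -> "0430" -> 1900-01-01 04:30:00 (the date part is just a placeholder)
clock = pd.to_datetime(
    df["shift_start_time"].astype(str).str.zfill(4), format="%H%M"
)

# Shift each time by its day number
start = clock + pd.to_timedelta(df["day"], unit="D")
print(start)
```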

lapis sequoia Oct 17, 2022, 9:41 PM

#

ok i'm with you so far

#

the datetime is for the start right?

serene scaffold Oct 17, 2022, 9:42 PM

#

yeah

#

will the start and end always be on an hour or on the half hour? like it will never be at 617 or 1535?

lapis sequoia Oct 17, 2022, 9:44 PM

#

no

serene scaffold Oct 17, 2022, 9:44 PM

#

no to which part

lapis sequoia Oct 17, 2022, 9:44 PM

#

itll always be on the hour or half hour

serene scaffold Oct 17, 2022, 9:48 PM

#

okay. I wonder if there's a way to "expand" each row into one row for each half-hour block

#

you said there's 7 days, right?

lapis sequoia Oct 17, 2022, 9:48 PM

#

yes

serene scaffold Oct 17, 2022, 9:53 PM

#

so you can do this

In [23]: pd.date_range(start='1900-01-01', freq='30min', periods=24 * 2 * 7)
Out[23]:
DatetimeIndex(['1900-01-01 00:00:00', '1900-01-01 00:30:00',
               '1900-01-01 01:00:00', '1900-01-01 01:30:00',
               '1900-01-01 02:00:00', '1900-01-01 02:30:00',
               '1900-01-01 03:00:00', '1900-01-01 03:30:00',
               '1900-01-01 04:00:00', '1900-01-01 04:30:00',
               ...
               '1900-01-07 19:00:00', '1900-01-07 19:30:00',
               '1900-01-07 20:00:00', '1900-01-07 20:30:00',
               '1900-01-07 21:00:00', '1900-01-07 21:30:00',
               '1900-01-07 22:00:00', '1900-01-07 22:30:00',
               '1900-01-07 23:00:00', '1900-01-07 23:30:00'],
              dtype='datetime64[ns]', length=336, freq='30T')

lapis sequoia Oct 17, 2022, 9:54 PM

#

is this on each row?

serene scaffold Oct 17, 2022, 9:54 PM

#

no, this is separate. it's every possible half-hour block

lapis sequoia Oct 17, 2022, 9:55 PM

#

oh i see

serene scaffold Oct 17, 2022, 9:58 PM

#

anyway, you can do something like this

In [36]: blocks = pd.date_range(start='1900-01-01', freq='30min', periods=24 * 2 * 7).to_numpy()[None, :]

In [37]: blocks.shape
Out[37]: (1, 336)

In [38]: shift_starts
Out[38]:
array([['1900-01-01T04:30:00.000000000'],
       ['1900-01-01T12:00:00.000000000'],
       ['1900-01-01T00:00:00.000000000']], dtype='datetime64[ns]')

In [39]: (shift_starts < blocks) & (blocks < shift_ends)
Out[39]:
array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

where you use broadcasting to get a 2d array of bools. each column is a block and each row is a shift. and it's True if that shift overlaps with that block

lapis sequoia Oct 17, 2022, 10:00 PM

#

i see

#

with you so far

serene scaffold Oct 17, 2022, 10:07 PM

#

well, that's it

lapis sequoia Oct 17, 2022, 10:09 PM

#

how do you get to a count of shift for each half hour?

serene scaffold Oct 17, 2022, 10:11 PM

#

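sum over the rows, i.e. take the total of each column with .sum(axis=0): each column is a block, so its total is the number of shifts active in that block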

lapis sequoia Oct 17, 2022, 10:12 PM

#

ok cool

#

thanks for your help
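
Putting the whole approach together as one sketch. The data is made up, the helper name `to_timestamp` is hypothetical, and days are assumed to be numbered from 0. Because each row of the boolean array is a shift and each column is a half-hour block, the per-block count is the sum along axis 0. The comparison uses `<=` on the start side so that the block beginning exactly at a shift's start is counted; shifts that run past midnight are not handled here.

```python
import pandas as pd

# Made-up shifts using the column names from the conversation
df = pd.DataFrame({
    "factory_position": ["welder", "packer", "guard"],
    "day": [0, 0, 1],
    "shift_start_time": [430, 1200, 0],
    "shift_end_time": [1230, 2000, 800],
})

def to_timestamp(day, hhmm):
    """Turn a day number and an HHMM integer into a placeholder timestamp."""
    clock = pd.to_datetime(hhmm.astype(str).str.zfill(4), format="%H%M")
    return clock + pd.to_timedelta(day, unit="D")

# Column vectors of shape (n_shifts, 1)
starts = to_timestamp(df["day"], df["shift_start_time"]).to_numpy()[:, None]
ends = to_timestamp(df["day"], df["shift_end_time"]).to_numpy()[:, None]

# Row vector of shape (1, 336): every half-hour block of the week
blocks = pd.date_range("1900-01-01", freq="30min", periods=48 * 7).to_numpy()[None, :]

# Broadcasting gives shape (n_shifts, 336): True where a shift covers a block
active = (starts <= blocks) & (blocks < ends)

# Number of active shifts in each half-hour block
counts = active.sum(axis=0)
print(counts[:50])
```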

strong sedge Oct 17, 2022, 11:06 PM

#

```py
'''
y = wx + b
dy = dwx + wdx + db

dy / dw = dw * x / dw + w * dx / dw + db / dw

what I think should be correct
dy / dw = x

dy / db = 1

dy / dx = w
'''
```

this is technically wrong
it should be

#

```py
'''
dw = dy * x
db = dy
dx = dy * w
'''
```

strong sedge Oct 17, 2022, 11:07 PM

#

strong sedge ```py ''' y = wx + b dy = dwx + wdx + db dy / d...

what am I doing wrong here ?
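
One possible reading, offered as an assumption since the notebook's definitions are not quoted here: in the second set of formulas `dy`, `dw`, `db` and `dx` are gradients of some downstream loss L, not differentials. Under that reading the chain rule gives dL/dw = dL/dy · dy/dw = dy · x, so both sets of formulas agree. A minimal numeric check, with a made-up loss L = y²:

```python
def forward(w, x, b):
    return w * x + b

def loss(y):
    # Hypothetical downstream loss, only here to have something to differentiate
    return y ** 2

def backward(w, x, b):
    """Gradients of the loss with respect to w, b and x via the chain rule."""
    y = forward(w, x, b)
    dy = 2 * y   # dL/dy for L = y**2
    dw = dy * x  # dL/dw = dL/dy * dy/dw, with dy/dw = x
    db = dy      # dL/db = dL/dy * dy/db, with dy/db = 1
    dx = dy * w  # dL/dx = dL/dy * dy/dx, with dy/dx = w
    return dw, db, dx

w, x, b = 1.5, 2.0, -0.5
dw, db, dx = backward(w, x, b)

# Central finite difference as an independent check on dw
eps = 1e-6
numeric_dw = (loss(forward(w + eps, x, b)) - loss(forward(w - eps, x, b))) / (2 * eps)

print(dw, db, dx)
print(numeric_dw)
```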

fringe anvil Oct 17, 2022, 11:36 PM

#

what style of plot is this?

#

ive looked at all the examples on matplotlib

merry pike Oct 17, 2022, 11:48 PM

#

I have a h5 model, but how do I run it with opencv

Screen_Shot_2022-10-17_at_7.48.16_PM.png

#

this is the code im using, but it doesnt work:
```py
from cvzone.ClassificationModule import Classifier
import cv2

cap = cv2.VideoCapture(0)
myClassifier = Classifier('eyedisease.h5', 'labels.txt')

while True:
    _, img = cap.read()
    predictions, index = myClassifier.getPrediction(img)
    print(predictions)

    cv2.imshow("Image", img)
    cv2.waitKey(1)
```

#

gives this error

Screen_Shot_2022-10-17_at_7.54.19_PM.png

rare socket Oct 18, 2022, 12:04 AM

#

my reinforcement learning network has 5 inputs and 3 outputs. No matter how many middle layers there are or how many nodes it has, my output is always only 1 option. I have tried different training algorithms and different activation functions but nothing works. Do I not have enough input nodes or something? I am not sure what to do. I would appreciate the help

novel python Oct 18, 2022, 12:30 AM

#

how can I compare how many "lowest" values a column has compared to 3 other columns?
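
One way to read this question, stated as an assumption: for each row, find which of the four columns holds the smallest value, then count how often each column "wins". A minimal pandas sketch with made-up data and column names:

```python
import pandas as pd

# Made-up data; the real column names are not given in the chat
df = pd.DataFrame({
    "a": [1, 5, 3, 0],
    "b": [2, 4, 9, 7],
    "c": [3, 6, 1, 8],
    "d": [4, 7, 2, 9],
})

# Name of the column holding the row-wise minimum
lowest_col = df.idxmin(axis=1)

# How many rows each column is the lowest in
counts = lowest_col.value_counts()
print(counts)
```

Columns that are never the lowest do not appear in `counts`; `counts.reindex(df.columns, fill_value=0)` would include them with a zero.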

woven pasture Oct 18, 2022, 12:42 AM

#

why does jax.grad fail on the following method

```py
def U(x):
    return np.sum(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1))
```

#

i truncated to np.linalg.norm after trying with np.sqrt(np.sum(np.square(... for a while

#

jax.grad is possible up until the np.sqrt part

#

also np is jax.numpy, not the standard numpy

rigid wadi Oct 18, 2022, 1:15 AM

#

Hi, has anyone worked with receipt data extraction before? Like extracting the invoice number, receipt date, amount, etc.
Are there any models that are ready to train for this?

fleet pulsar Oct 18, 2022, 1:17 AM

#

can anyone tell me good course about data science ?

hasty mountain Oct 18, 2022, 2:11 AM

#

Guys, any tips on how to deal with vanishing gradients in a discriminator from a GAN?
(My discriminator has only 3 layers and its optimizer is an adam with lr=1)

#

I can think about residual blocks and batchnormalization, but I suppose residual blocks aren't really a good option for a GAN, right?

desert parcel Oct 18, 2022, 2:53 AM

#

Hello. I have plotted a histogram of temperature to see which temperature occurs most often on any given day, and I have a question about the mode. The mode is at the red line, but I can see that on the right there's a temperature value that occurs more often. So why is the mode shown on the left instead of the right? If needed, the data comes from here https://raw.githubusercontent.com/MicrosoftDocs/mslearn-introduction-to-machine-learning/main/Data/ml-basics/daily-bike-share.csv
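
A likely explanation, offered without having seen the plot: the mode is the single most repeated exact value, while a histogram bar counts every value that falls inside a bin. Many slightly different values can make one bar tall even though none of them repeats. A minimal sketch with made-up numbers (not the bike-share data):

```python
import numpy as np
import pandas as pd

# 0.2 repeats exactly three times; the values near 0.7 are all distinct
values = pd.Series([0.2, 0.2, 0.2, 0.70, 0.71, 0.72, 0.73, 0.74, 0.75])

mode_value = values.mode().iloc[0]

# Two bins over [0, 1]: [0, 0.5) and [0.5, 1]
bin_counts, bin_edges = np.histogram(values, bins=2, range=(0, 1))

print(mode_value)  # the most repeated exact value
print(bin_counts)  # how many values fall in each bin
```

Here the mode sits in the shorter bar, which matches the situation described in the question.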

woeful hedge Oct 18, 2022, 4:37 AM

#

ACCURACY = THE POINT AND RANGE OF A MEASURED AMOUNT OF CAPABILITY A POSSIBILITY CAN HAPPEN AND DETERMINE COME INTO EFFECT
RADIUS = SET RANGE OF A CENTERED POINT TO THE END DESTINATION
DIAMETER = SET RANGE POINT FROM START TO MIDDLE TO THE END WHILE PASSING THE RADIUS
CONVERT = CHANGE FORM AND OR CHARACTER AND OR FUNCTION
PATTERN = REPEATING METHOD
WRITE = ENSCRIBE FROM LOOKING AT WORDS
READ = DESCRIBE FROM LOOKING AT A PATH OF WORDS
SPAN = MEASURED LIMITED RANGE
VIBRATION = PARTS THAT MOVE BACK AND FORTH AT A GIVEN SPEED
TRANSFORM = MAKE A CHANGE IN FORM
SYNCHRONIZE = LINK AND SEND THE SAME RESULT TO ALL SOURCES
SCAN = ANALYZE A SPECIFIC WORD OR FIELD AND OR GIVE DATA ON THE ASKED INFORMATION TO SEARCH FOR
ANALYZE = READ AND LOOK OVER
CALCULATE = GIVE A DESIGNATED OF A CALCULATES DESCRIPTION FOR A NUMBER AND GIVE ANSWER FOR ALL OF VALUE
LIMIT = SET DEFINED AMOUNT FOR KNOWLEDGE WITH A GIVEN POWER LEVEL
RECALL = GAIN THE ABILITY TO VIEW PAST MEMORY INSIDE BRAIN
REACH = GRAB TO PULL INWARDS
PREDICT = GIVE PERFECT VALUE
REPEAT = CYCLE SAME EFFECT AGAIN INTO SAME FREQUENCY
RECOGNIZE = RECALL FROM AN EARLIER POINT WITHIN TIME
ENCODE = COMPRESS CODE
DECODE = DECOMPRESS CODE
RECODE = COMPRESS CODE ONCE MORE

#

LOOP = BIND IN A CYCLE
MEASURE = TAKE IN THE AMOUNT AND DISTANCE OF
ANSWER = SOLUTION TO A PROBLEM
SOLUTION = FINAL OUTCOME TO A FORMULA
PROBLEM = UNFINISHED SOLUTION
SEARCH = FIND AND LOCATE SOMETHING
ASK = STATE A QUESTION
TIME = MEASUREMENT IN WHICH CURRENT REALITIES MUST PASS
SPACE = CONTAINER IN WHICH TIME MUST PASS THROUGH
UPLOAD = TRANSFER3 INTO DESCRIBED LOCATION
DOWNLOAD = TRANSFER3 TO CURRENT DEVICE
SIDELOAD = TRANSFER3 TO ALL DEVICES WITH STATUS OF STATED SET LOCATION
CLONE2 = MAKE AN IDENTICAL COPY OF
SYNCHRONIZE = LINK AND SEND THE SAME RESULT TO ALL SOURCES
ENCODE = COMPRESS CODE
DECODE = DECOMPRESS CODE
RECODE = COMPRESS CODE ONCE MORE
SETTING = A MEASUREMENT COMMAND THAT CAN BE ADJUSTED AND BY AN OPERATOR
ADJUST = EDIT AND MODIFY
EDIT = CHANGE AND OR MODIFY TO ADJUST TO A SPECIFIED PURPOSE
WORK = PRODUCING EFFORT TO FINISH A TASK
WORKLOAD = THE AMOUNT OF WORK
COMMAND = ORDER TO BE GIVEN
LINK = BRING TOGETHER AND ATTACH TO
BIND = EDIT AND MODIFY
LEVEL = NUMBER AMOUNT OF OR SIZE
UNIT = STORAGE CONTAINER
DIMENSION = NUMBER OF GIVEN AXIS POINTS
NUMBER = ARITHMETICAL VALUE THAT IS EXPRESSED BY A WORD AND OR SYMBOL AND OR FIGURE REPRESENTING A PARTICULAR QUANTITY AND USED IN COUNTING AND MAKING CALCULATIONS AND OR FOR SHOWING ORDER IN A SERIES OR FOR IDENTIFICATION
FREQUENCY = REPEATED PATTERN AND OR SETTING

#

POWER = AMOUNT
STRENGTH = LEVEL INTENSITY
CALIBRATE = SCALE WITH A STANDARD SET OF READINGS THAT CORRELATES THE READINGS WITH THOSE OF A STANDARD IN ORDER TO CHECK THE INSTRUMENT AND ITS ACCURACY
PUBLIC = ACCESS TO ALL OF CREATORS INTERIOR DOMINION
PRIVATE = HIDDEN TO EVERYONE BUT CURRENT2 USER2
PERSONAL = EXCLUSIVE TO THE CREATOR
ESCAPE = RETURN TO SOURCE PLACE2
RETURN = GO BACK
CONSTANT = ALWAYS IN EFFECT
CYCLE = PROCESS OF REPEATING AN EVENT CONTINUOUSLY IN THE SAME ORDER
MEASUREMENT = AN ACT TO CALCULATE AND GIVE A SPECIFIC LENGTH ON SOMETHING
CALCULATOR = A DEVICE USED TO CALCULATE INFORMATION AND ANALYZE SET TASKS AS A ROOT VALUE OF LOGIC
WAVELENGTH = A SET OF WAVE PATTERNS GIVEN FREQUENCY FORMAT IN A LENGTH OF A WAVE VALUE DETERMINED BY A PREVIOUS VALUE EFFECT
LENGTH = HOW LONG A MEASURED DIMENSIONAL OBJECT IS EXTENDED
LATTICE = INTERLACED STRUCTURE AND OR PATTERN
LOCATION = SPECIFIED AREA
LINE2 = CHOSEN DIRECTION THAT IS SET IN A SINGLE PATH
WAVE2 = CONTINUAL FLUCTUATION OF FREQUENCY AND OR PATTERN
WIDTH = MEASUREMENT OF SOMETHING FROM SIDE TO SIDE
HEIGHT = THE LENGTH OF RAISING OR LOWERING IN A VERTICAL PATH
HERTZ = DEFINED SOUND WAVE FREQUENCY
MEASURE = TAKE IN THE AMOUNT AND DISTANCE OF

#

Those are the mathematical variables my language has
just some and not a complete list yet

#

looking for input and feedback and what others think of it. How others see it could be used if I made it as a smaller library/module for it to connect to the full language with.
What others see within its potential as well

thorn nova Oct 18, 2022, 4:43 AM

#

Can anyone help me turn excel data into something that can be worked with in python? I'm new to data science and have already tried all the built in functions from pandas but it can never recognize my file for some reason, not sure if i should be saving it in a particular place first? Would appreciate if someone could hop on a call or something and help me work through this!

wary breach Oct 18, 2022, 5:26 AM

#

Bit confused about how to combine two different models into one. I.e. if I fit a linear regression model and also fit a XGBoost model to a dataset. I know sometimes you can get better scores utilizing both models but am unsure how to go about this process. Can anyone point me in the right direction? Thanks! (@ me on reply if you can please)

wary breach Oct 18, 2022, 5:27 AM

#

thorn nova Can anyone help me turn excel data into something that can be worked with in pyt...

Dm me what your file formatting looks like I can help point you in the right direction

topaz night Oct 18, 2022, 5:29 AM

#

woeful hedge ACCURACY = THE POINT AND RANGE OF A MEASURED AMOUNT OF CAPABILITY A POSSIBILITY ...

holy sht

woeful hedge Oct 18, 2022, 5:35 AM

#

@topaz night You Like it

hasty mountain Oct 18, 2022, 6:17 AM

#

wary breach Bit confused about how to combine two different models into one. I.e. if I fit a...

If you're using keras, you can pass the output of a model as input of another model.
Example: create a convolution model to extract features from an image and pass those features into XGBoost. Or you can extract features with PCA and pass them into a Decision Tree.

If you're using tensorflow or Pytorch for neural networks, things can get more interesting, as you can create a Neural Network with XGBoost inside of it.

wary breach Oct 18, 2022, 6:52 AM

#

hasty mountain If you're using keras, you can pass the output of a model as input of another mo...

I was using sklearn for both

ebon jewel Oct 18, 2022, 7:19 AM

#

Need help with python and pandas code

mint palm Oct 18, 2022, 7:19 AM

#

```
Traceback (most recent call last):
  File "/local/scratch/v_rahul_pratap_singh/UnsupervisedVAD/video_feature_extractor/extract.py", line 50, in <module>
    model = get_model(args)
  File "/local/scratch/v_rahul_pratap_singh/UnsupervisedVAD/video_feature_extractor/model.py", line 32, in get_model
    model = model.cuda()
  File "/shared/home/v_rahul_pratap_singh/miniconda3/envs/envRahul/lib/python3.10/site-packages/torch/nn/modules/module.py", line 689, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/shared/home/v_rahul_pratap_singh/miniconda3/envs/envRahul/lib/python3.10/site-packages/torch/nn/modules/module.py", line 579, in _apply
    module._apply(fn)
  File "/shared/home/v_rahul_pratap_singh/miniconda3/envs/envRahul/lib/python3.10/site-packages/torch/nn/modules/module.py", line 602, in _apply
    param_applied = fn(param)
  File "/shared/home/v_rahul_pratap_singh/miniconda3/envs/envRahul/lib/python3.10/site-packages/torch/nn/modules/module.py", line 689, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
is this an error or what? my code seems to keep running and making the relevant files, but this pops up

ebon jewel Oct 18, 2022, 7:19 AM

#

Anyone interested to have a look in my project and help me

strong sedge Oct 18, 2022, 8:04 AM

#

ebon jewel Anyone interested to have a look in my project and help me

ask your problem, some one will help

strong sedge Oct 18, 2022, 8:06 AM

#

hasty mountain Guys, any tips on how to deal with vanishing gradients in a discriminator from a...

lr = 1 is too much ?

strong sedge Oct 18, 2022, 8:07 AM

#

thorn nova Can anyone help me turn excel data into something that can be worked with in pyt...

export it as csv in excel
in python, import pandas as pd and then use pd.read_csv()
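A minimal sketch of that suggestion, assuming the sheet was exported as a plain comma-separated file. The column names and the inline text standing in for the file are made up for illustration; with a real export you would pass the file path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for a file exported from Excel as CSV; names and values are invented.
csv_text = "name,score\nalice,3\nbob,5\n"

# With a real file this would be pd.read_csv("your_export.csv")
df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)
print(df)
```

pandas also has `pd.read_excel` for reading the workbook directly, but it needs an extra engine such as openpyxl installed, so the CSV route is often the simpler start.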

strong sedge Oct 18, 2022, 8:08 AM

#

wary breach Bit confused about how to combine two different models into one. I.e. if I fit a...

its called stacking, take a read at this https://www.javatpoint.com/stacking-in-machine-learning

www.javatpoint.com

Stacking in Machine Learning - Javatpoint

Stacking in Machine Learning with Tutorial, Machine Learning Introduction, What is Machine Learning, Data Machine Learning, Machine Learning vs Artificial Intelligence etc.
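Since both models were fit with sklearn, here is a rough sketch of what stacking could look like there. It uses synthetic data, and `GradientBoostingRegressor` is swapped in for XGBoost so the example only needs scikit-learn; `xgboost.XGBRegressor` follows the same estimator interface and should slot into the same place.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic regression data, only for illustration.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("linear", LinearRegression()),
        ("boosted", GradientBoostingRegressor(random_state=0)),  # stand-in for XGBoost
    ],
    final_estimator=RidgeCV(),  # learns how to weight the base models' predictions
    cv=5,  # base-model predictions for the final estimator come from cross-validation
)
stack.fit(X_train, y_train)
r2 = stack.score(X_test, y_test)
print(f"stacked R^2 on held-out data: {r2:.3f}")
```

Whether the stack actually beats the better single model depends on the data, so it is worth scoring each base model on the same held-out split for comparison.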

wary breach Oct 18, 2022, 8:11 AM

#

strong sedge its called stacking, take a read at this https://www.javatpoint.com/stacking-in-...

Awesome thanks! I'll watch some videos on it

ebon jewel Oct 18, 2022, 8:43 AM

#

@strong sedge can we connect need to share my screen and make you understand my problem

silent stump Oct 18, 2022, 9:09 AM

#

Hi guys ive got my entry, take profit, and stop loss stored in my dataframe, but cant figure out how to track the profit and loss of the strategy. Any advice? thanks. This is for a trading strategy

topaz night Oct 18, 2022, 9:56 AM

#

woeful hedge @topaz night You Like it

yeah thats really cool bro like damn yk

topaz night Oct 18, 2022, 10:02 AM

#

strong sedge ask your problem, some one will help

what if myself is the problem :v

inland eagle Oct 18, 2022, 10:12 AM

#

    return int(v.strip(','))
does anyone know why a .strip won't work in this instance? i am trying to pull from a column in a data frame where it is all strings since the number values have commas (EX: 36,098) but i want to convert all of those values to ints without commas
```---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_191/1557144099.py in <module>
      2 def convert_votes_to_int(v):
      3     return int(v.strip(','))
----> 4 video_games = video_games.con
      5 video_games

AttributeError: 'DataFrame' object has no attribute 'con'```
this is the error message i am getting
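Two things seem to be going on here. The traceback points at line 4 (`video_games.con`, which looks like an unfinished line), so the conversion function never actually ran. Separately, `str.strip(',')` only trims commas from the two ends of a string, so it would leave `"36,098"` unchanged and `int()` would then fail. A small sketch using `str.replace` instead; the table and the `votes` column name are made up, and it is written with plain pandas using `.get`, `.apply` and `.assign`.

```python
import pandas as pd


def convert_votes_to_int(v):
    # strip(",") would only remove commas at the start or end of the string;
    # replace(",", "") removes them everywhere before converting.
    return int(v.replace(",", ""))


# Made-up stand-in for the real table; "votes" is an assumed column name.
video_games = pd.DataFrame(
    {"title": ["a", "b", "c"], "votes": ["36,098", "1,204", "987"]}
)
video_games = video_games.assign(
    votes=video_games.get("votes").apply(convert_votes_to_int)
)
print(video_games)
```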

steady basalt Oct 18, 2022, 10:22 AM

#

anyone know a cute way of getting the bottom 3 strings from a specific column? so lets say i did .tail(3) I want 3 values from that, it would be the same column for each row

#

lets call that rowCbottom3 = []

#

and then to match for row B

#

sorry

#

not row, COL

#

ah, worked it out

hushed kraken Oct 18, 2022, 12:10 PM

#

How can I see if my prediction model is the best model?

serene scaffold Oct 18, 2022, 12:13 PM

#

hushed kraken How can I see if my prediction model is the best model?

what does the model predict?

hushed kraken Oct 18, 2022, 12:15 PM

#

serene scaffold what does the model predict?

one model for solar energy production and another one for electricity prices

desert oar Oct 18, 2022, 12:16 PM

#

hushed kraken How can I see if my prediction model is the best model?

"best" by what standard?

hushed kraken Oct 18, 2022, 12:17 PM

#

accuracy I guess

desert oar Oct 18, 2022, 12:17 PM

#

i am not being glib. that's a legitimate question and an important one that you must answer in any modeling project!

#

in general, it's hard to know if your model is "best" but you can compare various models to see which one is better

hushed kraken Oct 18, 2022, 12:18 PM

#

So the only way is by testing multiple models and comparing them?

desert oar Oct 18, 2022, 12:18 PM

#

in general yes. in statistics specifically, certain models have certain desirable mathematically-proven characteristics in some situations.

#

but that also doesn't make them "best" for any particular application

#

bias-variance tradeoff is also important to consider

#

would you prefer a model with really small average error, but huge variation in the predictions? or would you prefer a model with modest average error, but less variation in the predictions?

hushed kraken Oct 18, 2022, 12:21 PM

#

desert oar would you prefer a model with really small average error, but huge variation in ...

wdym by huge variation in predictions?

desert oar Oct 18, 2022, 12:21 PM

#

if you don't know what "bias-variance" tradeoff is, go look it up right now

hushed kraken Oct 18, 2022, 12:21 PM

#

ok thnx

hushed kraken Oct 18, 2022, 12:30 PM

#

desert oar would you prefer a model with really small average error, but huge variation in ...

The best would be ofc a small bias and small variation, but I think for our project a small bias would be more important

desert oar Oct 18, 2022, 12:32 PM

#

hushed kraken The best would be ofc a small bias and small variation, but I think for our proj...

ok, but then you have to be comfortable with the chance that your particular sample gives substantially "incorrect" results!

hushed kraken Oct 18, 2022, 12:33 PM

#

but wouldnt a bigger bias also give incorrect results?

desert oar Oct 18, 2022, 12:34 PM

#

this often comes up with observational data, such as that collected from the environment. it's often helpful to think of "the environment" as a big random sampling engine: physical phenomena are the outcomes of random data generating processes. you get exactly one opportunity to observe that data generating process, because time only runs forward!

#

so it's tempting to look at a time series at the millisecond scale of something like solar energy, and conclude that you have a big data set, and therefore that you don't care about variance and must minimize bias. but there is a legitimate interpretation in which you have a data set of exactly 1 data point.

hushed kraken Oct 18, 2022, 12:38 PM

#

so its actually better to find a balance between the bias and variation error?

desert oar Oct 18, 2022, 12:39 PM

#

hushed kraken so its actually better to find a balance between the bias and variation error?

yes. it's not something you can always tune precisely, but it's something important to consider when asking what the "best" model is

hushed kraken Oct 18, 2022, 12:41 PM

#

And how can I calculate these, because right now I'm only calculating the mse of the last training data and the mse of the prediction
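One way to get a feel for the two quantities is a small simulation where the true function is known, which is an assumption that never holds for real data. The sketch below refits polynomial models of different degrees on many freshly drawn training sets and measures how far the average prediction is from the truth (squared bias) and how much predictions move between fits (variance). The function, noise level and degrees are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_fn(x):
    # Known ground truth, only available because the data is simulated.
    return np.sin(2 * np.pi * x)


x_test = np.linspace(0.05, 0.95, 50)


def bias_variance(degree, n_repeats=200, n_train=30, noise=0.3):
    preds = np.empty((n_repeats, x_test.size))
    for i in range(n_repeats):
        # Draw a fresh noisy training set and refit the model each time.
        x = rng.uniform(0, 1, n_train)
        y = true_fn(x) + rng.normal(0, noise, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coefs, x_test)
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance


results = {degree: bias_variance(degree) for degree in (1, 3, 7)}
for degree, (bias_sq, variance) in results.items():
    print(f"degree {degree}: bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

On real data the truth is unknown, so people usually settle for proxies such as the spread of cross-validation scores or of predictions across bootstrap resamples.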

#

Also another question, since I am using 2 models for the energy prices and solar energy production, would stacking be a good method to make a more accurate prediction?

grand olive Oct 18, 2022, 1:51 PM

#

i need help choosing between tf and pytorch.
i've read that pytorch pretty much beats tf when it comes to use in research, and is starting to get more and more popular in the industry

i'm a bit concerned about deployment though (i'm only concerned about deploying to web apps)
read that it's a bit harder to deploy with pytorch. is this still true? or has it become easier to deploy pytorch now?

my interests include mostly NLP(mostly japanese) and music (music theory, metadata, genres)

fast rivet Oct 18, 2022, 2:04 PM

#

this command Dataset.from_dict(dutch_dict) gives me this error pyarrow.lib.ArrowInvalid: Column 1 named validation expected length 43410 but got length 5426
I just want to convert a dictionary to a Dataset object which I've imported from datasets but I don't know why I'm getting this error.

lean jacinth Oct 18, 2022, 2:17 PM

#

grand olive i need help choosing between tf and pytorch. i've read that pytorch pretty much ...

Do both

#

Tensorflow still has the most weight behind it, but it's a bit of a relic
Pytorch is the up and comer and will likely overtake eventually

grand olive Oct 18, 2022, 2:22 PM

#

lean jacinth Do both

aight thanks

#

last question
how hard is it to deploy to the web with pytorch vs tensorflow as of now? (the articles i've been reading are from 1-2 years ago and i'd guess pytorch has improved since then)

lean jacinth Oct 18, 2022, 2:24 PM

#

grand olive last question how hard is it to deploy to the web with pytorch vs tensorflow as ...

If you're using cloud platforms they're both as easy as each other, not sure about 3rd party software though

#

Like I use GCP for model deployment and both are integrated in the same way

hushed kraken Oct 18, 2022, 2:38 PM

#

I got this error and can't fix it pls help : (

 Graph execution error:

fleet pulsar Oct 18, 2022, 3:02 PM

#

lean jacinth Oct 18, 2022, 3:02 PM

#

fleet pulsar

Real programmers delete and start from scratch whenever they reload their IDE

fleet pulsar Oct 18, 2022, 3:03 PM

#

lean jacinth Real programmers delete and start from scratch whenever they reload their IDE

but i am beginner

lean jacinth Oct 18, 2022, 3:03 PM

#

fleet pulsar but i am beginner

Then begin

fleet pulsar Oct 18, 2022, 3:04 PM

#

i started python

#

today

#

i feel hardness

hasty mountain Oct 18, 2022, 3:05 PM

#

strong sedge lr = 1 is too much ?

Kinda... It's quite rare to see algorithms with lr = 1

hasty mountain Oct 18, 2022, 3:05 PM

#

wary breach I was using sklearn for both

Oh, you can use sklearn, too. My head was still in the neural networks

lean jacinth Oct 18, 2022, 3:06 PM

#

hasty mountain I can think about residual blocks and batchnormalization, but I suppose residual...

Batchnorm tends to be my go to for GAN vanishing gradients, are they not working for your case?

hasty mountain Oct 18, 2022, 3:06 PM

#

In sklearn you can make a model's output be another model's input
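A hedged sketch of that idea with the PCA into Decision Tree example mentioned earlier in the thread. It uses sklearn's bundled iris data purely for illustration; the number of components and the choice of classifier are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The PCA step's output (2 components) becomes the tree's input.
pipe = make_pipeline(PCA(n_components=2), DecisionTreeClassifier(random_state=0))

# Cross-validating the whole pipeline refits PCA inside each fold,
# so the feature extraction never sees the held-out rows.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```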

hasty mountain Oct 18, 2022, 3:06 PM

#

lean jacinth Batchnorm tends to be my go to for GAN vanishing gradients, are they not working...

No, they weren't. I had to lower the Linear layers in the discriminator

#

Use less neurons

lean jacinth Oct 18, 2022, 3:07 PM

#

Ah

hasty mountain Oct 18, 2022, 3:09 PM

#

However, it seems that it's stabilized for now... I'm adding random noise to the discriminator's inputs, using label-smoothing, weights initialization...

#

Aaaand updating after each batch, instead of each epoch

strong sedge Oct 18, 2022, 3:11 PM

#

hasty mountain Kinda... It's quite rare to see algorithms with lr = 1

First time I am seeing something like this
What are you doing?

hasty mountain Oct 18, 2022, 3:12 PM

#

strong sedge First time I am seeing something like this What are you doing?

A Text GAN

strong sedge Oct 18, 2022, 3:17 PM

#

hasty mountain A Text GAN

Ohhh cool

minor coral Oct 18, 2022, 3:35 PM

#

hii

#

does anyone of u know what I can do in #help-honey

rare socket Oct 18, 2022, 4:44 PM

#

hello, I am trying to change a single weight and bias in my model but I am not sure how to go about. Is there some sort of indexing through the model? model[column][row] <-- like this?

#

I'm using pytorch

hasty mountain Oct 18, 2022, 4:46 PM

#

rare socket hello, I am trying to change a single weight and bias in my model but I am not s...

Yes, you can access a model's parameters by looping over model.parameters()


for param in model.parameters():
    print(param)

or

for name, param in model.named_parameters():
    print(name, param)

rare socket Oct 18, 2022, 4:47 PM

#

thank you

rare socket Oct 18, 2022, 5:07 PM

#

is there a way to index model.parameters() without the loop?
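For what it's worth, a sketch of a few ways to reach a single entry without writing the loop. The tiny `nn.Sequential` model is made up for illustration; on a custom module you would go through its attribute names (for example a hypothetical `model.fc1.weight`) rather than `model[0]`.

```python
import torch
from torch import nn

# Toy model, only for illustration.
model = nn.Sequential(nn.Linear(3, 2), nn.ReLU(), nn.Linear(2, 1))

# Name -> parameter lookup, built once from named_parameters()
params = dict(model.named_parameters())
print(list(params))  # for this Sequential: '0.weight', '0.bias', '2.weight', '2.bias'

# Positional access is also possible by materialising the generator
first_weight = list(model.parameters())[0]

# Edit individual entries in place; no_grad keeps autograd from tracking the write
with torch.no_grad():
    model[0].weight[1, 2] = 0.5  # row 1, column 2 of the first layer's weight
    model[0].bias[0] = -1.0
```

`nn.Linear` stores its weight with shape `(out_features, in_features)`, so the first index picks the output unit and the second the input feature.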

plush jungle Oct 18, 2022, 5:09 PM

#

I got this error

#

RuntimeError: CUDA out of memory. Tried to allocate 2.43 GiB (GPU 0; 8.00 GiB total capacity; 5.70 GiB already allocated; 0 bytes free; 6.52 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

#

which I find strange, because my memory is supposed to be 16 gb

#

so why does it say 8.00 GiB total capacity?

agile cobalt Oct 18, 2022, 5:33 PM

#

plush jungle ``` RuntimeError: CUDA out of memory. Tried to allocate 2.43 GiB (GPU 0; 8.00 Gi...

probably GPU memory vs RAM memory?

hasty mountain Oct 18, 2022, 5:33 PM

#

plush jungle which I find strange, because my memory is supposed to be 16 gb

CUDA uses the Video memory, not the pc RAM memory

#

Try checking your dxdiag
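The same number can also be read from PyTorch itself. A small sketch, assuming PyTorch is installed; the helper name `gpu_memory_gib` is made up, and it returns `None` on a machine with no CUDA device visible.

```python
import torch


def gpu_memory_gib(device_index=0):
    """Total memory of a CUDA device in GiB, or None if CUDA is unavailable."""
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / 1024**3


total = gpu_memory_gib()
print("no CUDA device visible" if total is None else f"{total:.2f} GiB of GPU memory")
```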

plush jungle Oct 18, 2022, 5:34 PM

#

so my gpu only has 8gb of memory?

hasty mountain Oct 18, 2022, 5:34 PM

#

Only

#

Yes

plush jungle Oct 18, 2022, 5:34 PM

#

hasty mountain > Only

is that a lot? stylegan3 recommends 16gb

hasty mountain Oct 18, 2022, 5:35 PM

#

plush jungle is that a lot? stylegan3 recommends 16gb

Because it probably relies on multiple GPUs from cloud servers

#

Those big models usually do that

agile cobalt Oct 18, 2022, 5:37 PM

#

plush jungle is that a lot? stylegan3 recommends 16gb

well yeah, some huge neural networks require completely absurd amounts of memory that no personal computer should have, and are expected to run (exclusively) on cloud providers

plush jungle Oct 18, 2022, 5:38 PM

#

task manager says I've got 2 gpus, a 3080 and "AMD radeon Graphics"

#

but when I try to train neural nets it only lets me use the 3080

#

is the amd one not a real gpu or something?

agile cobalt Oct 18, 2022, 5:38 PM

#

CUDA = nvidia = probably doesn't support amd I guess

hasty mountain Oct 18, 2022, 5:39 PM

#

agile cobalt CUDA = nvidia = probably doesn't support amd I guess

It doesn't

plush jungle Oct 18, 2022, 5:39 PM

#

so why would the manufacturer of the computer put two incompatible gpus?

agile cobalt Oct 18, 2022, 5:39 PM

#

they're not incompatible per se
it's just not compatible with cuda

plush jungle Oct 18, 2022, 5:39 PM

#

oh

hasty mountain Oct 18, 2022, 5:40 PM

#

Unfortunately, it doesn't seem to be that easy to use AMD GPUs for neural networks... I remember I tried to search about it, and...nothing.

#

Maybe most algorithms tend to rely on NVidia GPUs because of that...except for Google's, since they like to use their TPUs

agile cobalt Oct 18, 2022, 5:41 PM

#

if you're just messing with GANs for fun or even some project you haven't got much progress into yet, you might as well move on to Stable Diffusion tbh - almost completely (if not completely) different network architecture, but generalises a lot better as far as I know

plush jungle Oct 18, 2022, 5:42 PM

#

agile cobalt if you're just messing with GANs for fun or even some project you haven't got mu...

really? I've heard a lot of hype about stable diffusion recently, but I thought it's still worse at high quality images and face datasets

hasty mountain Oct 18, 2022, 5:42 PM

#

agile cobalt if you're just messing with GANs for fun or even some project you haven't got mu...

But diffusion models tend to be heavier than GANs

agile cobalt Oct 18, 2022, 5:43 PM

#

hasty mountain *But diffusion models tend to be heavier than GANs*

do they? I don't know how much you need to train it, but I know that you can run inference with 8GB for sd

agile cobalt Oct 18, 2022, 5:43 PM

#

plush jungle really? I've heard a lot of hype about stable diffusion recently, but I thought...

weren't you doing some Avatar the last airbender stuff?

plush jungle Oct 18, 2022, 5:43 PM

#

agile cobalt weren't you doing some Avatar the last airbender stuff?

yep!

agile cobalt Oct 18, 2022, 5:44 PM

#

that doesn't really fit into either "high quality" or "face datasets" I think?

plush jungle Oct 18, 2022, 5:46 PM

#

my main goal was to generate new character designs, kind of like a "novel pokemon gan" I saw someone do

#

but I trained a vanilla gan from scratch and the results were both very blurry and extremely overfit

#

and then I discovered thiswaifudoesnotexist, which retrained stylegan2 on a small dataset of anime girls

hasty mountain Oct 18, 2022, 5:47 PM

#

agile cobalt do they? I don't know how much you need to train it, but I know that you can run...

Oh, inference is usually ok, the training is the problem.
The diffusion model creates too many samples for a single training loop.

plush jungle Oct 18, 2022, 5:47 PM

#

and it was super crisp

#

so now I'm trying to retrain stylegan3 on my dataset

#

to achieve the same result

hasty mountain Oct 18, 2022, 5:48 PM

#

OpenAI even developed a diffusion model that is better than DeepMind's BigGAN, which is the state of the art GAN, but the computation power that thing demands...
Each checkpoint file has, like, 1 Gb.

agile cobalt Oct 18, 2022, 5:49 PM

#

plush jungle and it was super crisp

link?
I found two github repos about Pokemon GAN, both of which are very much not high quality at all

plush jungle Oct 18, 2022, 5:49 PM

#

agile cobalt link? I found two github repos about Pokemon GAN, both of which are very much *n...

yeah they weren't great

#

but the waifu one was excellent

hasty mountain Oct 18, 2022, 5:50 PM

#

plush jungle but I trained a vanilla gan from scratch and the results were both very blurry a...

What was the size of images you tried to generate? 64x64?

#

Blurry images tend to be normal, so GAN models usually rely on SuperResolution nets...
Maybe some of them don't, but others do.
I think BigGAN uses something to avoid this, but it was so complicated that I can't remember... but there's NVidia's Progressive Growing of GANs (ProGAN), where the networks grow to higher resolutions as training progresses, and it generates quite interesting images at quite a high resolution.

mint palm Oct 18, 2022, 6:08 PM

#

i am using ssh.
if i clean GPU cache because i am getting CUDA out of memory, will it affect others using that GPU?

timid kiln Oct 18, 2022, 6:28 PM

#

Working with dates/times in a pandas dataframe.

One of the columns in a df of data from our SQL server is ip_date (initial production date). Pandas says it's type object. I need to work on this as a date, so I run .to_datetime on it, and now its type datetime64[ns]. However, when I try to get the data type off of an individual value in that column of data, its type is <class 'pandas._libs.tslibs.timestamps.Timestamp'>.

  meters_sql = ...  # result of the sql query (a DataFrame)
  print(meters_sql.dtypes) # Says column `ip_date` is `object`
  meters_sql['ip_date'] = pd.to_datetime(meters_sql['ip_date'])
  print(meters_sql.dtypes) # Says column `ip_date` is `datetime64[ns]`
  print(type(meters_sql['ip_date'][1])) # Says it's type ...timestamps.Timestamp

How do I force this to be a datetime? Or what module would I use to work with timestamp?

agile cobalt Oct 18, 2022, 6:29 PM

#

timestamps are pandas's version of datetimes

#

!e import pandas; print(pandas.Timestamp.mro())

arctic wedgeBOT Oct 18, 2022, 6:29 PM

#

@agile cobalt :white_check_mark: Your 3.11 eval job has completed with return code 0.

[<class 'pandas._libs.tslibs.timestamps.Timestamp'>, <class 'pandas._libs.tslibs.timestamps._Timestamp'>, <class 'pandas._libs.tslibs.base.ABCTimestamp'>, <class 'datetime.datetime'>, <class 'datetime.date'>, <class 'object'>]

agile cobalt Oct 18, 2022, 6:30 PM

#

pandas.Timestamp is to datetime.datetime what numpy.float64 is to float
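A quick sketch of what that means in practice, going by the MRO printed above: a `Timestamp` already passes as a `datetime.datetime`, so the usual datetime methods work on a single value, and `to_pydatetime()` is there if a plain standard-library object is really needed. The date is just an example value.

```python
import datetime

import pandas as pd

ts = pd.Timestamp("2022-05-14")

print(isinstance(ts, datetime.datetime))  # True, Timestamp subclasses datetime

first_of_month = ts.replace(day=1)  # datetime-style replace works on one value
as_plain = ts.to_pydatetime()       # plain datetime.datetime if you need one
print(first_of_month, type(as_plain))
```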

timid kiln Oct 18, 2022, 6:33 PM

#

agile cobalt `pandas.Timestamp` is to `datetime.datetime` what `numpy.float64` is to `float`

OK... So the reason I'm asking this is I tried to run (forgive me for using terms badly) a list comprehension on the df to replace all the day values in the dates with the number 1. So 5/14/2022 would become 5/1/2022. I'm very much a beginner with list comprehensions so I tried this and got an error:

meters_sql['ip_date'] = [meters_sql['ip_date'].replace(day=1) for x in meters_sql['ip_date']]

error message: Series.replace() got an unexpected keyword argument 'day'

#

So my first thought was that the type of data in that series is not datetime so that's how I got to where I am now.

#

OK, so I think I figured out the first part of the list comprehension error. I have this now:

meters_sql['ip_date'] = [meters_sql['ip_date'][x].replace(day=1) for x in meters_sql['ip_date']]

The error message is: Exception has occurred: KeyError Timestamp('2018-05-19 00:00:00')

I'm at a loss as to what to do here.

agile cobalt Oct 18, 2022, 6:57 PM

#

timid kiln OK... So the reason I'm asking this is I tried to run (forgive me for using term...

don't ever iterate over pandas dataframes - especially, do not use list comprehensions for that kind of stuff.

timid kiln Oct 18, 2022, 6:58 PM

#

agile cobalt don't ever iterate over pandas dataframes - especially, do not use list comprehen...

So use if/else instead?

agile cobalt Oct 18, 2022, 6:58 PM

#

look up pandas vectorized operations

timid kiln Oct 18, 2022, 6:58 PM

#

hokey pokey, thx 🙂

#

Sounds complicated so if I start dropping those words around the developers maybe they'll think I'm smart lol

#

Oh man, that looks a lot simpler and easier to understand. At least the first couple examples I see.

agile cobalt Oct 18, 2022, 6:59 PM

#

explicit loops are as bad as (or even worse than) pure python code without pandas
apply()/map() with user defined functions is bad and shouldn't be used either, but still beats explicit loops
you should always use specific built-in methods that operate over the entire series

timid kiln Oct 18, 2022, 7:03 PM

#

agile cobalt explicit loops are as bad as (or even worse than) pure python code without panda...

Understood. This discord is my main point of education for such things (specific built-in methods). Thank you!

So to vectorize the replacement of the day in the field ip_date, it would be something like this I suppose?

df['ip_date'] = df['ip_date'].replace(day=1)

agile cobalt Oct 18, 2022, 7:05 PM

#

pretty much

#

I recommend taking a look at the pandas documentation at https://pandas.pydata.org/docs/user_guide if you haven't yet
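One caveat, as far as I can tell: `Series.replace` swaps values and has no `day` keyword, which is the same "unexpected keyword argument 'day'" error reported earlier in the thread, so `df['ip_date'].replace(day=1)` would likely fail too. A sketch of one vectorized alternative through the `.dt` accessor, with made-up dates standing in for the real column:

```python
import pandas as pd

# Invented sample values for the ip_date column.
df = pd.DataFrame(
    {"ip_date": pd.to_datetime(["2022-05-14", "2018-05-19", "2021-12-01"])}
)

# Subtract (day - 1) days from every row at once to land on the 1st of the month.
day_offset = pd.to_timedelta(df["ip_date"].dt.day - 1, unit="D")
df["ip_date"] = df["ip_date"] - day_offset
print(df)
```

This keeps any time-of-day component intact; it only moves the date back to the first of its month.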

digital locust Oct 18, 2022, 7:39 PM

#

Hey there! I'm building a Django app and I use pandas a lot to process data. I have come across one big problem: at some point in my app, data analysis takes like forever. I have the following code:

    i = df['agencia'] == 'DHL'
    for row in tqdm(df[i].index):
        for col in df.columns:
            for supplement_col in supplements_columns_names:
                for supplement_col_total in supplements_columns_names_total:
                    for supplement_price_col in list_df_supplements_prices_columns:
                        if df.loc[row, col] == supplement_price_col:
                            df.at[row, supplement_price_col] = df_supplements_prices.at[0, supplement_price_col]
                            theoretical_price = df.at[row, supplement_price_col]
                            invoiced_price = df.at[row, supplement_col_total]

                            if theoretical_price != invoiced_price:
                                errors_data.append(
                                    {'Package number': df.at[row, 'agencia'], 'Supplement error': supplement_price_col,
                                     'Invoiced price': invoiced_price, 'Theoretical price': theoretical_price,
                                     'Difference': invoiced_price - theoretical_price})

    # Generation DF errors
    df_errors = pd.DataFrame(errors_data)

I know that pandas does not recommend to loop through a DF. But in my case, I have to get to a precise cell to append data, i.e. getting the row and column for this part :

df.at[row, supplement_price_col] = df_supplements_prices.at[0, supplement_price_col]

For 2000 rows, the analysis takes like 4 min (!), which is way too long. So here's my question, I know it is possible to do better, but could you please guide me? I did look up and saw df.apply(lambda row) for example but since I'm using a lot of conditions and loops, it is unclear to me whether I should use this function or not...
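A rough sketch of how the row loop might be replaced with boolean masks. It rests on several assumptions that are not in the original code: that each supplement has one matching "total" column (the hypothetical `supplement_totals` mapping), that only a known set of columns holds supplement names (`label_columns`), and the tiny tables are invented. The loop over a handful of supplement names stays, but each step works on all rows at once.

```python
import pandas as pd

# Invented data standing in for the real invoice table.
df = pd.DataFrame({
    "agencia": ["DHL", "DHL", "UPS"],
    "supplement_1": ["fuel", "remote", "fuel"],  # cells that name a supplement
    "fuel_total": [2.0, 0.0, 2.0],               # invoiced amounts
    "remote_total": [0.0, 7.5, 0.0],
})
df_supplements_prices = pd.DataFrame({"fuel": [2.0], "remote": [6.0]})

# Hypothetical pairing of each supplement with its invoiced-total column.
supplement_totals = {"fuel": "fuel_total", "remote": "remote_total"}
label_columns = ["supplement_1"]

is_dhl = df["agencia"] == "DHL"
error_frames = []
for name, total_col in supplement_totals.items():
    theoretical = df_supplements_prices.at[0, name]
    # DHL rows where any label column mentions this supplement
    hit = is_dhl & df[label_columns].eq(name).any(axis=1)
    df.loc[hit, name] = theoretical
    wrong = hit & (df[total_col] != theoretical)
    if wrong.any():
        error_frames.append(pd.DataFrame({
            "Package number": df.loc[wrong, "agencia"],
            "Supplement error": name,
            "Invoiced price": df.loc[wrong, total_col],
            "Theoretical price": theoretical,
            "Difference": df.loc[wrong, total_col] - theoretical,
        }))

df_errors = (
    pd.concat(error_frames, ignore_index=True) if error_frames else pd.DataFrame()
)
print(df_errors)
```

If the real pairing between price columns and total columns is different, the mapping is the part to adapt; the mask-then-assign pattern should carry over.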

inland eagle Oct 18, 2022, 7:57 PM

#

does anyone know how to keep only a certain amount of rows in a data frame

#

like for example i just want to keep the first 10 rows of a df with like over 200 rows

#

what function would i use

digital locust Oct 18, 2022, 8:00 PM

#

@inland eagle maybe df.head(10) ?

timid kiln Oct 18, 2022, 8:01 PM

#

inland eagle what function would i use

do you want to actually delete the rows, or just display the first 10 rows?

inland eagle Oct 18, 2022, 8:03 PM

#

"to a DataFrame that contains the ten most common genres of video games, in descending order"

inland eagle Oct 18, 2022, 8:03 PM

#

timid kiln do you want to actually delete the rows, or just display the first 10 rows?

so i think just display the first 10 rows

#

or like 0-9

inland eagle Oct 18, 2022, 8:05 PM

#

digital locust @inland eagle maybe `df.head(10)` ?

for some reason it won't let me use that

#

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_644/3463392015.py in <module>
      2 sheesh = yolo.get('title').sort_values(ascending=False)
      3 wut = yolo.get(['title']).assign(count = sheesh).drop(columns='title')
----> 4 most_common_genres = wut.head(10)
      5 most_common_genres

AttributeError: 'DataFrame' object has no attribute 'head'
this is the error message i am getting

#

ignore the variable and df names lol

timid kiln Oct 18, 2022, 8:11 PM

#

inland eagle ignore the variable and df names lol

Skip the part where you assign most_common_genres and just go straight to wut.head(10)

#

print(wut.head(10))

plush jungle Oct 18, 2022, 8:12 PM

#

I'm trying to retrain stylegan3 starting from one of their pretrained models, and I'm getting this

  File "C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\torch_utils\misc.py", line 162, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (12) must match the size of tensor b (24) at non-singleton dimension 0

#

this github issue about the same error in stylegan2 says it's because my dataset doesn't have the same number of classes as the pretrained model

#

https://github.com/NVlabs/stylegan2-ada-pytorch/issues/156

#

but how could I fix this?

lapis sequoia Oct 18, 2022, 8:13 PM

#

web dev here, looking to start my first ai python project

#

any tips?

plush jungle Oct 18, 2022, 8:14 PM

#

lapis sequoia web dev here, looking to start my first ai python project

what type of ai?

lapis sequoia Oct 18, 2022, 8:14 PM

#

i would prefer nothing to do with data

#

anything else

#

wait nvm

#

any ai is fine

plush jungle Oct 18, 2022, 8:15 PM

#

there's image recognition, image generation, NLP

lapis sequoia Oct 18, 2022, 8:15 PM

#

image recognition sounds cool

plush jungle Oct 18, 2022, 8:15 PM

#

if that's the case, then you should look into the basics of neural nets

#

train a pytorch neural net on the MNIST dataset

lapis sequoia Oct 18, 2022, 8:16 PM

#

okay

plush jungle Oct 18, 2022, 8:17 PM

#

I highly recommend 3blue1brown's youtube video on neural networks

#

https://www.youtube.com/watch?v=aircAruvnKk

YouTube

3Blue1Brown

But what is a neural network? | Chapter 1, Deep learning

What are the neurons, why are there layers, and what is the math underlying it?
Help fund future projects: https://www.patreon.com/3blue1brown
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks

Additional funding for this project provided by Amplify Partners

Typo correction: At 14 minutes 45 seconds, th...

▶ Play video

lapis sequoia Oct 18, 2022, 8:17 PM

#

okay

#

do i just install any mnist dataset?

plush jungle Oct 18, 2022, 8:18 PM

#

lapis sequoia do i just install any mnist dataset?

the mnist dataset refers to the dataset of handwritten 0-9 digits built from NIST data (the M stands for "Modified")

#

training a neural net to look at images of handwritten digits and predict what number it is is a great starting project

lapis sequoia Oct 18, 2022, 8:19 PM

#

okay

inland eagle Oct 18, 2022, 8:20 PM

#

timid kiln Skip the part where you assign `most_common_genres` and just go straight to `wut...

im still getting error messages

#

plush jungle Oct 18, 2022, 8:21 PM

#

lapis sequoia do i just install any mnist dataset?

pytorch actually has the mnist dataset as one of the builtin ones
https://pytorch.org/vision/stable/generated/torchvision.datasets.MNIST.html#torchvision.datasets.MNIST

timid kiln Oct 18, 2022, 8:36 PM

#

inland eagle im still getting error messages

That's bizarre. Never seen that issue before. The error even says it's a dataframe.

Try print(type(most_common_genres)) and see what you get

inland eagle Oct 18, 2022, 8:37 PM

#

this is the output:
babypandas.bpd.DataFrame

inland eagle Oct 18, 2022, 8:37 PM

#

timid kiln That's bizarre. Never seen that issue before. The error even says it's a dataf...

my class is using baby pandas which is basically a smaller version of pandas for beginners

timid kiln Oct 18, 2022, 8:37 PM

#

inland eagle my class is using baby pandas which is basically a smaller version of pandas fo...

ok

#

idk what that is tho 😄

inland eagle Oct 18, 2022, 8:38 PM

#

timid kiln idk what that is tho 😄

it is basically pandas

#

like majority of the functions are the same

timid kiln Oct 18, 2022, 8:38 PM

#

sure

#

what did you get for the print/type command?

#

oh sorry you posted it

inland eagle Oct 18, 2022, 8:38 PM

#

timid kiln what did you get for the print/type command?

wdym

#

lol

timid kiln Oct 18, 2022, 8:39 PM

#

maybe the head command isn't available in babypandas

timid kiln Oct 18, 2022, 8:39 PM

#

inland eagle this is the output: ```babypandas.bpd.DataFrame```

https://babypandas.readthedocs.io/en/latest/

#

Take a look in there and see if there's a 'head' command anywhere

#

You could always just make a loop:

for i in range(10):
  print(df[i])

#

That might work for ya.

inland eagle Oct 18, 2022, 8:41 PM

#

IndexError: BabyPandas only accepts Boolean objects when indexing against the data frame; please use .get to get columns, and .loc or .iloc for more complex cases.

timid kiln Oct 18, 2022, 8:42 PM

#

hmmm, just a sec...

#

Give this a try:

print(df.iloc[0:9])

#

idk how limited baby pandas is tho
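A small note on the slice bounds: `iloc` slices follow Python's convention and leave out the stop index, so rows 0 through 9 (ten rows) need a stop of 10. The sketch uses regular pandas with a made-up table; babypandas is assumed to behave the same way for `iloc`.

```python
import pandas as pd

# Invented table with 200 rows.
df = pd.DataFrame({"genre": [f"genre_{i}" for i in range(200)]})

first_nine = df.iloc[0:9]   # positions 0..8, the stop index is excluded
first_ten = df.iloc[0:10]   # positions 0..9

print(len(first_nine), len(first_ten))
```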

agile cobalt Oct 18, 2022, 8:46 PM

#

...why would you use that over normal pandas?

inland eagle Oct 18, 2022, 8:55 PM

#

agile cobalt ...why would you use that over normal pandas?

it is because this version of pandas was made for this specific course, so like ya

#

i have no choice but to use it over regular pandas

inland eagle Oct 18, 2022, 9:01 PM

#

timid kiln Give this a try: ```py print(df.iloc[0:9]) ```

OMG thank you that finally worked

timid kiln Oct 18, 2022, 9:02 PM

#

inland eagle OMG thank you that finally worked

I'm glad! Now... can you help me with datetime stuff?

inland eagle Oct 18, 2022, 9:02 PM

#

timid kiln I'm glad! Now... can you help me with `datetime` stuff?

i can try, i am only a beginner. what are you trying to do?

timid kiln Oct 18, 2022, 9:03 PM

#

Trying to replace the day in a datetime if the value of day is < 15.

I have this, but I get an error on df.loc[df['fom'].day

df.loc[df['fom'].day < 15, 'fom'] = df['fom'].apply(lambda dt: dt.replace(day=1))

#

I got the basic syntax from here: https://datagy.io/pandas-conditional-column/

datagy

Nik

Set Pandas Conditional Column Based on Values of Another Column • d...

Learn how to create a Pandas conditional column using Pandas apply, map, loc, and numpy select in order to use values of one or more columns.
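The error on `df['fom'].day` most likely comes from the fact that a Series has no `.day` attribute; datetime fields sit behind the `.dt` accessor. A minimal sketch, assuming `fom` is a datetime64 column (the sample dates are invented):

```python
import pandas as pd

# Invented dates; "fom" is assumed to already be a datetime column
df = pd.DataFrame({"fom": pd.to_datetime(["2022-10-03", "2022-10-20", "2022-11-14"])})

# Boolean mask built through the .dt accessor
early = df["fom"].dt.day < 15

# Replace the day with 1 only on the rows selected by the mask
df.loc[early, "fom"] = df.loc[early, "fom"].apply(lambda ts: ts.replace(day=1))
print(df)
```

Restricting the right-hand side to the masked rows keeps the work proportional to the rows that actually change.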

next sorrel Oct 18, 2022, 9:38 PM

#

Hello, does anyone here have a good amount of experience with PyTorch?

#

I was just wondering if someone can help me understand how to prepare data for nn.LSTM or nn.LSTMCell, the long short term memory, a recurrent neural network

fringe anvil Oct 18, 2022, 9:44 PM

#

anyone knows why my graph looks like this?

#

instead of this reference image.. i cant figure it out.. been at it for a while

#

fig2,ax2 = plt.subplots(figsize=(5,4))
fig2.patch.set_facecolor("None")
ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")
x,y = df2["year"],df2["player1 double faults"]/df2["player1 total points total"]
ax2.scatter(x,y,alpha=0.5)
ax2.plot(x,y,"-",color="orange")
mpl.style.use("default")

molten forge Oct 18, 2022, 10:04 PM

#

Does anyone have experience in federated learning?

serene scaffold Oct 18, 2022, 10:34 PM

#

molten forge Does anyone have experience in federated learning?

No, what's that? lemon_hyperpleased

novel python Oct 18, 2022, 10:52 PM

#

how do I get all the values with pandas groupby? I want to sort it by 2 columns but I want the rest of the columns to come as a result too, but all I'm getting is a generic GroupBy object in return. Do I necessarily have to use a function with groupby for it to return something?

serene scaffold Oct 18, 2022, 10:54 PM

#

novel python how do I get all the values with pandas groupby? I want to sort it by 2 columns ...

yes. think of the GroupBy as a bag of dataframes, where each dataframe is one of the groups you made. but you can't see them again until you do something that reduces them back to one dataframe.

#

but if you want "all the values", you might rethink why you're using groupby. you usually end up with less data after grouping and doing something with the groups, not the same amount.

novel python Oct 18, 2022, 10:57 PM

#

oooh, I see. It makes total sense now, thanks! Basically, I wanted to turn this dataset:

#

into this one, where it separates by months

#

I thought using groupby would do, but doesn't look like it's the proper solution

serene scaffold Oct 18, 2022, 10:58 PM

#

I think you're looking for pivot_table

novel python Oct 18, 2022, 10:59 PM

#

oh, let me check that

serene scaffold Oct 18, 2022, 11:00 PM

#

if you get stuck, do print(df.sample(10).to_dict('list')) for me and put it in the chat as text (no screenshots), and ping me.

#

also if the .sample(10) part doesn't give you rows with at least two months represented, just do it again until it does.

novel python Oct 18, 2022, 11:01 PM

#

alrighty, thanks a lot, will try it out with the documentation and will reach you if I get stuck

desert oar Oct 18, 2022, 11:17 PM

#

novel python how do I get all the values with pandas groupby? I want to sort it by 2 columns ...

you can iterate over the groupby object, which yields group_label, group_data pairs

#

as stelercus said, usually you don't need to do this

#

but sometimes it comes in handy. i do it now and then
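For reference, iterating a GroupBy can be sketched like this. The column names and values are invented for illustration; each iteration hands back the group label and a DataFrame that still carries every column.

```python
import pandas as pd

# Invented data; "month", "store" and "sales" are illustrative column names
df = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Feb"],
    "store": ["a", "b", "a", "b"],
    "sales": [10, 20, 30, 40],
})

# Each iteration yields (group_label, group_data)
groups = {label: group for label, group in df.groupby("month")}
for label, group in groups.items():
    print(label, group["sales"].tolist())
```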

#

note that pivot_table might give funny results with a datetime column

#

obviously you can manually construct a "year-month" column first and use that for pivoting

#

personally i can never remember the arguments for pivot_table so i would probably do .resample followed by .unstack
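The "year-month column first, then pivot" idea could look roughly like the sketch below. The original dataset was only posted as screenshots, so the column names (`date`, `product`, `amount`) and the `sum` aggregation are assumptions.

```python
import pandas as pd

# Hypothetical columns; the real dataset was not shared as text
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-05", "2022-01-20", "2022-02-03", "2022-02-10"]),
    "product": ["x", "y", "x", "x"],
    "amount": [5, 7, 11, 13],
})

# Explicit year-month key so the pivot does not depend on raw timestamps
df["year_month"] = df["date"].dt.to_period("M").astype(str)

table = df.pivot_table(index="product", columns="year_month",
                       values="amount", aggfunc="sum", fill_value=0)
print(table)
```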

#

(what is RDD?)

hasty mountain Oct 18, 2022, 11:20 PM

#

Hey guys, when I load a .wav file using librosa.load, what is the unit of measurement for the y axis?
I know that it loads audio data in a time-series, so the x is seconds, but what about the y? Amplitude in decibels?
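As far as I know, `librosa.load` returns a float array of linear amplitudes (normalized to roughly [-1, 1]) together with the sample rate, so the values are not decibels and the x positions are sample indices rather than seconds. A NumPy-only sketch of the two conversions, using a synthetic tone in place of a loaded file; the 22050 Hz rate and the 440 Hz tone are assumptions for illustration.

```python
import numpy as np

sr = 22050                       # assumed sample rate in Hz
t = np.arange(sr) / sr           # sample index -> time in seconds
y = 0.5 * np.sin(2 * np.pi * 440 * t).astype(np.float32)  # stand-in for the loaded signal

peak = float(np.max(np.abs(y)))  # linear amplitude, unitless
peak_db = 20 * np.log10(peak)    # decibels relative to full scale (1.0)
print(peak, peak_db)
```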

desert parcel Oct 19, 2022, 1:01 AM

#

I posted a question in #🤡help-banana would be very grateful if anyone could answer 🙏🏻

fringe anvil Oct 19, 2022, 1:45 AM

#

i did some changes to my code, ive lost my orange line. but the data looks better. tho the x axis doesnt give anything proper. can anyone provide pointers?

fig2,ax2 = plt.subplots(figsize=(6,4))
fig2.patch.set_facecolor("None")

dbl_ratio = pd.DataFrame(df2["player1 double faults"]/df2["player1 total points total"]) # good
y_avr = dbl_ratio
x_grpby = df2.groupby("year")

x,y = df2["start date"].values,dbl_ratio

ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")
ax2.scatter(x,y,alpha=0.2) # good
ax2.plot(x_grpby,y_avr,"-",color="orange")
mpl.style.use("default")

#

it would need to look like this

#

i was thinking, using the mean for the data points to draw the orange line.. but nothing of what i use/do works
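A possible sketch of the "mean per year" line: `ax.plot` joins points in row order, which would explain the zigzag in the first version when rows are not sorted by year, and a GroupBy object cannot be passed to `plot` directly. The numbers below are invented; only the column names come from the posted code.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented rows using the column names from the posted code
df2 = pd.DataFrame({
    "year": [2019, 2018, 2019, 2018, 2020],
    "player1 double faults": [4, 2, 6, 3, 5],
    "player1 total points total": [100, 100, 120, 60, 100],
})

ratio = df2["player1 double faults"] / df2["player1 total points total"]
# One mean per year; groupby sorts the years, so the line runs left to right
yearly_mean = ratio.groupby(df2["year"]).mean()

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(df2["year"], ratio, alpha=0.2)
ax.plot(yearly_mean.index, yearly_mean.values, "-", color="orange")
ax.set_xlabel("Year")
ax.set_ylabel("Double faults per match")
```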

steady basalt Oct 19, 2022, 7:13 AM

#

serene scaffold No, what's that? lemon_hyperpleased

Splitting ur stuff up to protect privacy of the data

strong sedge Oct 19, 2022, 7:53 AM

#

desert parcel I posted a question in #🤡help-banana would be very grateful if anyone co...

Think of it this way, if the plot of prediction and y_true is same, then the prediction is 100% accurate
If it's not the same, you can see when for what values the model predicts wrong answers

#

Basically
U plot 2 graphs on the same figure
1 is test vs preds
2 is test vs y_true

desert parcel Oct 19, 2022, 8:08 AM

#

strong sedge Basically U plot 2 graphs on the same figure 1 is test vs preds 2 is test vs y_t...

Ahh I see thank you

compact star Oct 19, 2022, 8:25 AM

#

I am trying to create a NEAT implementation in python and in the papers it says that neural networks that haven't improved in x generations will be removed, what is the definition for having not improved and how would I check for it?

strong sedge Oct 19, 2022, 9:41 AM

#

compact star I am trying to create a neat implementation in python and in the papers it says ...

a species (not a genome) whose fitness doesn't go up after x generations will be removed

compact star Oct 19, 2022, 9:47 AM

#

is the fitness of the species the average fitness of the genomes in that species?

compact star Oct 19, 2022, 9:56 AM

#

strong sedge the fitness of a particular species (not genome) doesn't go up after x generatio...

The species themselves aren't preserved each generation as in even if the same genomes are in that species they might not be in the same one?

strong sedge Oct 19, 2022, 9:58 AM

#

compact star is the fitness of the species the average fitness of the genomes in that species...

yeah
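One way the "hasn't improved in x generations" check could be tracked, as a sketch only: keep the best species fitness seen so far and count the generations since it last went up. The class and field names are hypothetical, and using the mean of the members follows the answer above rather than any particular NEAT implementation.

```python
class Species:
    """Hypothetical species record; names are illustrative only."""

    def __init__(self):
        self.best_fitness = float("-inf")
        self.generations_without_improvement = 0

    def update(self, member_fitnesses):
        # Species fitness taken as the mean of its members (assumption from the chat)
        fitness = sum(member_fitnesses) / len(member_fitnesses)
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.generations_without_improvement = 0
        else:
            self.generations_without_improvement += 1

    def is_stagnant(self, max_stagnation):
        return self.generations_without_improvement >= max_stagnation


species = Species()
for fitnesses in ([1.0, 2.0], [2.0, 3.0], [2.0, 2.5], [1.0, 3.0], [2.5, 2.0]):
    species.update(fitnesses)
print(species.generations_without_improvement, species.is_stagnant(3))
```

Calling `update` once per generation and dropping species where `is_stagnant(x)` is true gives the removal rule described in the question.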

strong sedge Oct 19, 2022, 9:58 AM

#

compact star The species themselves aren't preserved each generation as in even if the same g...

I dont remember how, but I think there is a way to keep track of which genome is from which species

#

watch this guys video
https://www.youtube.com/watch?v=3nbvrrdymF0&ab_channel=NeatAI

YouTube

Neat AI

Neat AI does Neat Speciation

Does speciation make a difference when finding solutions using neural nets ?

Watch the video to find out..

Music :
https://www.bensound.com/


#

he has a bunch of videos on explaining parts of neat

split drift Oct 19, 2022, 10:12 AM

#

Why does summing array of numbers, using pyarrow is faster than NumPy?
https://stackoverflow.com/questions/74123523/why-does-summing-array-of-number-using-pyarrow-is-faster-than-numpy

Stack Overflow

Why does summing array of numbers, using pyarrow is faster than NumPy?

I noticed that summing array of number is faster, using pyarrow, than NumPy, Why?
Input:
np_days = np.random.randint(0, 100, 100000000, dtype=np.int8)
np_months = np.random.randint(0, 100, 100000000,

compact star Oct 19, 2022, 10:13 AM

#

strong sedge I dont remember how, but I think there is a way to keep track of which genome is...

I have watched his video but he does not mention how he checks if the species has not improved

strong sedge Oct 19, 2022, 10:18 AM

#

compact star I have watched his video but he does not mention how he checks if the species has...

I honestly can't remember how

#

Did u try reading the original research paper ?

compact star Oct 19, 2022, 10:23 AM

#

strong sedge I honestly can't remember how

I have tried reading the original paper but I couldn't find anything where he explicitly mentions it

strong sedge Oct 19, 2022, 10:28 AM

#

compact star I have tried reading the original paper but I couldn't find anything where he ex...

He also has a website with a lot of q and a

#

Check that out

compact star Oct 19, 2022, 10:38 AM

#

strong sedge He also has a website with alot of q and a

tysm for ur help

#

how do I read a .ps file?

compact star Oct 19, 2022, 11:15 AM

#

strong sedge He also has a website with alot of q and a

Do you by any chance know java?

#

Because I have the java version of neat and that references the drop off age but I would need help understanding how that works

strong sedge Oct 19, 2022, 11:19 AM

#

compact star Because I have the java version of neat and that references the drop off age but...

No i don't
I would suggest just reading it like it's pseudocode
The language hardly ever matters for understanding how stuff works

compact star Oct 19, 2022, 11:22 AM

#

ok thank ty for ur help

bleak coyote Oct 19, 2022, 11:31 AM

#

Whats the best way to display confusion matrices?

#

sklearn, but are there better alternatives

celest vine Oct 19, 2022, 12:09 PM

#

empty_df = pd.DataFrame()

for name in sd_eth_list:
    profile_url = df.loc[df['name'].str.contains(name, case=False)]
    
    empty_df.append(profile_url)

The dataframe still remains empty after running the code.
What am I doing wrong?

rich olive Oct 19, 2022, 12:20 PM

#

celest vine ```py empty_df = pd.DataFrame() for name in sd_eth_list: profile_url = df.l...

not supposed to append dataframes. Use .concat() instead. Also is df a temp variable or how is it defined?
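Background on why the frame stayed empty: `DataFrame.append` returned a new frame instead of modifying the original, and it was deprecated in pandas 1.4 and removed in 2.0. A minimal sketch of the collect-then-concat pattern, with invented names and search terms:

```python
import pandas as pd

# Invented data; "names" stands in for the list being searched
df = pd.DataFrame({"name": ["Alice A", "Bob B", "alice c"], "id": [1, 2, 3]})
names = ["alice", "bob"]

matches = []
for name in names:
    # Collect each filtered piece; nothing is modified in place
    matches.append(df.loc[df["name"].str.contains(name, case=False)])

result = pd.concat(matches, ignore_index=True)
print(result)
```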

#

y = np.linalg.solve(random_img, heart_img)

#

why arent these the same matrix

rich olive Oct 19, 2022, 12:22 PM

#

celest vine ```py empty_df = pd.DataFrame() for name in sd_eth_list: profile_url = df.l...

also dont iterate through dataframes at all just apply. one sec

#

contains_name = df.query('name in @sd_eth_list')

#

filtering a dataframe is not done through iteration

#

You have to query or aggregate the data to reduce the size

#

or dimensionality, for aggregating

celest vine Oct 19, 2022, 12:56 PM

#

rich olive not supposed to append dataframes. Use .concat() instead. Also is df a temp vari...

df is a dataframe of usernames

rich olive Oct 19, 2022, 12:57 PM

#

does my code work

celest vine Oct 19, 2022, 1:02 PM

#

rich olive does my code work

Did not work

wheat snow Oct 19, 2022, 1:02 PM

#

native-country      salary
United-States       >50K      7171
?                   >50K       146
Philippines         >50K        61
Germany             >50K        44

i got a lil dataframe here

e= df[['native-country', 'salary']]
highest_earning_country= e[e['salary']== '>50K'].value_counts()

i filtered it to show the country and the salary over 50K

now i want to print out the country with the leading salary which would be the US

celest vine Oct 19, 2022, 1:02 PM

#

urls = []

for name in sd_eth_list:
    
    if profile_url == (df['name'].str.contains(name, case=False)).any():
        url = df[df['name'].str.contains(name, case=False)]['profileUrl']
        
        urls.append(url)

Tried this as well but did not work as well

rich olive Oct 19, 2022, 1:02 PM

#

celest vine Did not work

send the df

wheat snow Oct 19, 2022, 1:02 PM

#

smh i forgot how to do this

#

via .index() maybe?

rich olive Oct 19, 2022, 1:03 PM

#

wheat snow ``` native-country salary United-States >50K 7171 ? ...

print(highest_earning_country[0])

celest vine Oct 19, 2022, 1:03 PM

#

rich olive send the df


profileUrl    screenName    name    bio    followersCount    friendsCount
0    https://twitter.com/TheSnoopAvatars    TheSnoopAvatars    The Doggies    Enter tha Metaverse with @SnoopDogg x @TheSand...    88130    16
1    https://twitter.com/JulienROMAN13    JulienROMAN13    Julien ROMAN    💵 Investisseur / Youtube 🎬\n\n💸 Finance - Inve...    88768    162
2    https://twitter.com/landz_nft    landz_nft    Landz.io - Minting NOW    The first disruptive Real Estate NFT collectio...    53608    266
3    https://twitter.com/borgetsebastien    borgetsebastien    Sebastien 🏞    Co-Founder & COO of @TheSandboxGame, the open ...    93652    1138
4    https://twitter.com/cryptoamazo    cryptoamazo    Crypto Amazo    Crypto Promoter | Giveaway | DM to sponsor a #...    15743    56

rich olive Oct 19, 2022, 1:05 PM

#

share it the way you constructed it lol or as a csv

#

actually nvm ill figure it out with another df

wheat snow Oct 19, 2022, 1:06 PM

#

rich olive print(highest_earning_country[0])

i get the number 7171 but not the name, i want the output to be

United-States

celest vine Oct 19, 2022, 1:06 PM

#

rich olive send the df

Basically I have a list of names that I want to find in the dataframe's name column.
I can do it with .isin but that looks for exact match.
I want results the way .str.contains() gives
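A sketch of a substring match against a whole list in one pass: join the escaped names into a single regex alternation and hand it to `.str.contains`. The rows reuse a few names from the sample posted above; the contents of `sd_eth_list` are hypothetical.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "name": ["The Doggies", "Julien ROMAN", "Crypto Amazo"],
    "profileUrl": [
        "https://twitter.com/TheSnoopAvatars",
        "https://twitter.com/JulienROMAN13",
        "https://twitter.com/cryptoamazo",
    ],
})
sd_eth_list = ["doggies", "amazo"]  # hypothetical contents

# Escape each name so regex metacharacters in a name are matched literally
pattern = "|".join(re.escape(name) for name in sd_eth_list)
mask = df["name"].str.contains(pattern, case=False, regex=True, na=False)
urls = df.loc[mask, "profileUrl"].tolist()
print(urls)
```

This avoids both the Python loop and row-wise `apply`, and `na=False` keeps missing names from breaking the mask.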

rich olive Oct 19, 2022, 1:08 PM

#

wheat snow i get the number 7171 but not the name, i want the OIutput to be ``` United-Sta...

try as_index=False for .value_counts()

#

nvm thats not a thing

celest vine Oct 19, 2022, 1:10 PM

#

wheat snow i get the number 7171 but not the name, i want the OIutput to be ``` United-Sta...

highest_earning_country= e[e['salary']== '>50K'].value_counts().reset_index()['native-country']

Try this
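An alternative sketch for getting the label rather than the count: run `value_counts` on the country column alone and ask for the index label of the largest count. The rows are invented; only the column names come from the question.

```python
import pandas as pd

# Invented rows with the column names from the question
df = pd.DataFrame({
    "native-country": ["United-States", "United-States", "Germany", "United-States", "Germany"],
    "salary": [">50K", ">50K", ">50K", "<=50K", "<=50K"],
})

counts = df.loc[df["salary"] == ">50K", "native-country"].value_counts()
top_country = counts.idxmax()   # index label of the largest count
print(top_country, counts.max())
```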

wheat snow Oct 19, 2022, 1:14 PM

#

alr

celest vine Oct 19, 2022, 1:15 PM

#

@rich olive where you at help me

rich olive Oct 19, 2022, 1:15 PM

#

one sec

wheat snow Oct 19, 2022, 1:15 PM

#

works 🫂

rich olive Oct 19, 2022, 1:16 PM

#

filtered = df.query(lambda row: name_ele in row.name for name_ele in sd_eth_list)

#

maybe

#

wait you want contains

celest vine Oct 19, 2022, 1:17 PM

#

celest vine Basically I have a list of names that I want to find in the dataframe's name col...

this

rich olive Oct 19, 2022, 1:17 PM

#

filtered = df.query(lambda row: row.name.contains(name_ele) for name_ele in sd_eth_list)

#

case=False

celest vine Oct 19, 2022, 1:19 PM

#

rich olive ```py filtered = df.query(lambda row: row.name.contains(name_ele) for name_ele i...

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_8584\2498179663.py in <module>
----> 1 filtered = df.query(lambda row: row.name.contains(name_ele, case=False) for name_ele in sd_eth_list)

c:\users\user\appdata\local\programs\python\python37\lib\site-packages\pandas\core\frame.py in query(self, expr, inplace, **kwargs)
   4055         if not isinstance(expr, str):
   4056             msg = f"expr must be a string to be evaluated, {type(expr)} given"
-> 4057             raise ValueError(msg)
   4058         kwargs["level"] = kwargs.pop("level", 0) + 1
   4059         kwargs["target"] = None

ValueError: expr must be a string to be evaluated, <class 'generator'> given

rich olive Oct 19, 2022, 1:21 PM

#

yeah that makes sense. sorry Im pretty new too. can fix tho one sec

#

filtered = df.apply(lambda row: row.name.contains(name_ele) for name_ele in sd_eth_list)

fossil ivy Oct 19, 2022, 1:26 PM

#

I have a multiindexed dataframe like:

                   Duration  Duration/MW       Cost  Cost (m)  Times Offshore Exceeded  Times Vessel Full
Vessel Start Date                                                                                        
JUV    2022-01-01    34.688        4.818  3.983e+06     3.983                      1.5                0.0
       2022-01-02    33.296        4.624  3.839e+06     3.839                      1.4                0.0
       2022-01-03    34.354        4.771  3.948e+06     3.948                      1.6                0.1
       2022-01-04    30.342        4.214  3.534e+06     3.534                      1.5                0.1
       2022-01-05    35.092        4.874  4.025e+06     4.025                      1.6                0.1
       2022-01-06    31.342        4.353  3.637e+06     3.637                      1.4                0.2
       2022-01-07    30.100        4.181  3.509e+06     3.509                      1.3                0.2

WTIV   2022-01-01    34.688        4.818  3.983e+06     3.983                      1.5                0.0
       2022-01-02    33.296        4.624  3.839e+06     3.839                      1.4                0.0
       2022-01-03    34.354        4.771  3.948e+06     3.948                      1.6                0.1
       2022-01-04    30.342        4.214  3.534e+06     3.534                      1.5                0.1
       2022-01-05    35.092        4.874  4.025e+06     4.025                      1.6                0.1
       2022-01-06    31.342        4.353  3.637e+06     3.637                      1.4                0.2
       2022-01-07    30.100        4.181  3.509e+06     3.509                      1.3                0.2

I need to create a boxplot of Duration for each Vessel/ Start Date combination. Ive been struggling to make it work could someone help me? It would be much appreciated

#

whoops wrong df, I created this for the time-series analysis of duration and costs but they are the mean of 10 runs for each pair Vessel, Start Date

#

I have a long ~7300x7 dataframe, where each entry Vessel/ Start Date is separate 20 times, with the same index

celest vine Oct 19, 2022, 1:28 PM

#

rich olive ```py filtered = df.apply(lambda row: row.name.contains(name_ele) for name_ele i...

AttributeError: 'str' object has no attribute 'contains'

fossil ivy Oct 19, 2022, 1:29 PM

#

fossil ivy I have a long `~7300x7` dataframe, where each entry `Vessel/ Start Date` is sepa...

that looks like

...
723   WTIV 2022-12-28    10.750  ...     2.248                       0                  0
724    JUV 2022-12-29    43.333  ...     4.876                       2                  0
725   WTIV 2022-12-29     6.833  ...     1.647                       0                  0
726    JUV 2022-12-30    43.667  ...     4.910                       2                  0
727   WTIV 2022-12-30    12.083  ...     2.452                       0                  0
728    JUV 2022-12-31    47.917  ...     5.349                       2                  0
729   WTIV 2022-12-31     8.000  ...     1.826                       0                  0
0      JUV 2022-01-01    35.375  ...     4.054                       1                  0
1     WTIV 2022-01-01     6.500  ...     1.596                       0                  0
2      JUV 2022-01-02    33.083  ...     3.817                       1                  0
3     WTIV 2022-01-02    10.250  ...     2.171                       0                  0
4      JUV 2022-01-03    30.875  ...     3.589                       1                  0
5     WTIV 2022-01-03     9.250  ...     2.018                       0                  0
6      JUV 2022-01-04    10.917  ...     1.528                       0                  1
...

rich olive Oct 19, 2022, 1:32 PM

#

celest vine ```py AttributeError: 'str' object has no attribute 'contains' ```

contains isnt a python method lol use in

rich olive Oct 19, 2022, 1:32 PM

#

rich olive ```py filtered = df.query(lambda row: name_ele in row.name for name_ele in sd_et...

.

#

with df.apply() tho

rich olive Oct 19, 2022, 1:34 PM

#

fossil ivy I have a long `~7300x7` dataframe, where each entry `Vessel/ Start Date` is sepa...

you have 7300 x values?

fossil ivy Oct 19, 2022, 1:35 PM

#

rich olive you have 7300 x values?

No I basically have 20 times the identical 729x7 dataframe

#

Just appended to each other:
simulation(strategy) generates one of those 729x7 dfs

full_results = []

for i in range(0, 10):
    print("Run", i+1, "of 10")
    full_results.append(simulation(strategy))

#

I managed to get a boxplot for each vessel, but using the entire year as data for each boxplot. Instead I need to have a boxplot for each day of the year per vessel

#

I don't quite know how to implement the Date still

rich olive Oct 19, 2022, 1:37 PM

#

you can datetime or just manually parse the date

#

whats the difference doing it year vs day

fossil ivy Oct 19, 2022, 1:38 PM

#

my research investigates the impact of weather seasonality on offshore wind farm decommissioning project performance

rich olive Oct 19, 2022, 1:38 PM

#

okay what are you trying to boxplot lol

fossil ivy Oct 19, 2022, 1:38 PM

#

the box and whisker for the duration per day

#

Because the time-series graph I create takes the average of 20 runs, so very high values and very low values are not considered

rich olive Oct 19, 2022, 1:39 PM

#

so for each day of the year, you want the spread of duration across all vessels

fossil ivy Oct 19, 2022, 1:39 PM

#

for each day of the year, I want the spread of duration per vessel

#

Because I want to investigate if one of the vessels is more subject to weather uncertainties/ impacts

rich olive Oct 19, 2022, 1:40 PM

#

you cant boxplot that, its 3 dimensional

fossil ivy Oct 19, 2022, 1:40 PM

#

I found something like this, that's how I imagined it but I can't get it to work: https://stackoverflow.com/questions/46603823/boxplot-with-multiindex

Stack Overflow

Boxplot with multiindex

Let's say i have a Dataframe with columns as Multiindex. For example:

a = pd.DataFrame(index=range(10),
columns=pd.MultiIndex.from_product(
iterables=[['...

#

The graph at the bottom of the thread

#

just with the year representing my vessels, and the a/b the start date on the x axis

rich olive Oct 19, 2022, 1:41 PM

#

so you want each vessel as a sub-hierarchy to each day in a boxplot

fossil ivy Oct 19, 2022, 1:41 PM

#

yes

#

Im rather new/ unknowledgable in coding so yeah... quite tough to get behind it

rich olive Oct 19, 2022, 1:42 PM

#

me too so we'll see if I can even be any help

fossil ivy Oct 19, 2022, 1:42 PM

#

wait a sec... Isn't my structure pretty much identical to the df in the thread?

#

   Vessel  Start Date    Duration
717   WTIV 2022-12-25    12.000  ...     2.439                       0                  0
718    JUV 2022-12-26    47.333  ...     5.289                       1                  0
719   WTIV 2022-12-26    10.000  ...     2.133                       0                  0
720    JUV 2022-12-27    45.917  ...     5.143                       2                  0
721   WTIV 2022-12-27    10.500  ...     2.210                       0                  0

#

Year in his example would be my Vessel, Text would be my Start Date and data would be duration?

rich olive Oct 19, 2022, 1:45 PM

#

sure. I imagine most dfs would apply. Im reading through it now but pivoting is hard lol

fossil ivy Oct 19, 2022, 1:45 PM

#

Pivoting is a bitch yeah

celest vine Oct 19, 2022, 1:45 PM

#

rich olive contains isnt a python method lol use in

I ran this code

filtered = df.apply(lambda row: row.name.contains(name_ele, case=False) for name_ele in sd_eth_list)

fossil ivy Oct 19, 2022, 1:45 PM

#

fossil ivy I have a multiindexed dataframe like: ```py Duration Duratio...

I mean I did it here, but couldn't get anywhere with it for the boxplot, works perfectly for the time-series

celest vine Oct 19, 2022, 1:46 PM

#

Fuck this shit man! I think I should open a small grocery shop

fossil ivy Oct 19, 2022, 1:46 PM

#

celest vine Fuck this shit man! I think I should open a small grocery shop

how come

rich olive Oct 19, 2022, 1:47 PM

#

celest vine I ran this code ```py filtered = df.apply(lambda row: row.name.contains(name_ele...

is contains a method? i cant find documentation lol

celest vine Oct 19, 2022, 1:47 PM

#

rich olive is contains a method? i cant find documentation lol

You gave that code

rich olive Oct 19, 2022, 1:47 PM

#

filtered = df.apply(lambda row: name_ele in row for name_ele in sd_eth_list)

#

yeah because I assumed you were using .contains() correctly lmao

fossil ivy Oct 19, 2022, 1:48 PM

#

Soooooooooo yeaaah

#

Looks like too much data for this lol

celest vine Oct 19, 2022, 1:48 PM

#

rich olive yeah because I assumed you were using .contains() correctly lmao

Means you are saying I am dumb, right? Yes I am but don't say that directly

rich olive Oct 19, 2022, 1:49 PM

#

Im saying thats why i put it in my code. you said dumb

#

try the above

fossil ivy Oct 19, 2022, 1:50 PM

#

fossil ivy Soooooooooo yeaaah

Do you guys reckon it would help if I created an individual plot for each month?

rich olive Oct 19, 2022, 1:51 PM

#

fossil ivy Soooooooooo yeaaah

i mean, that looks about right

fossil ivy Oct 19, 2022, 1:51 PM

#

it looks pretty much like a greyed out version of my time-series

rich olive Oct 19, 2022, 1:51 PM

#

i just wouldnt use a boxplot

fossil ivy Oct 19, 2022, 1:51 PM

#

#

What would you suggest otherwise?

rich olive Oct 19, 2022, 1:51 PM

#

hm one sec

fossil ivy Oct 19, 2022, 1:52 PM

#

Something like this would be nice as well, but probably same story as the boxplot

rich olive Oct 19, 2022, 1:53 PM

#

Maybe heatmap the duration on a 2x2 vessel x day and examine the spread separately

fossil ivy Oct 19, 2022, 1:53 PM

#

bless you

rich olive Oct 19, 2022, 1:53 PM

#

lmao np

fossil ivy Oct 19, 2022, 1:54 PM

#

I meant bless you like what are those words lol

#

looking at it now though

rich olive Oct 19, 2022, 1:54 PM

#

haha oh yeah theyre not words just had a micro-seizure

fossil ivy Oct 19, 2022, 1:54 PM

#

fair enough loool

#

I see what you mean by heatmap

#

but how can I imagine the structure there?

#

Would you mean duration on y axis, date on x axis

#

and then heatmap per vessel

rich olive Oct 19, 2022, 1:56 PM

#

one axis of heatmap is day the other is vessel so each square is a vessel on a day, trends along each axis, colour is avg duration

fossil ivy Oct 19, 2022, 1:56 PM

#

but then that would not model spread (?)

rich olive Oct 19, 2022, 1:57 PM

#

not spread per vessel per day. Thats what I meant by examine it separately like per vessel or per day

fossil ivy Oct 19, 2022, 1:57 PM

#

aaah

rich olive Oct 19, 2022, 1:57 PM

#

but youll be able to see spread of avg duration across year and vessel

fossil ivy Oct 19, 2022, 1:57 PM

#

fossil ivy Soooooooooo yeaaah

Yeah I think this one is so messy because it has one pair duration spread/vessel for every single day of the year

rich olive Oct 19, 2022, 1:57 PM

#

and then use other examinations to highlight areas of interest on the heatmap

fossil ivy Oct 19, 2022, 1:58 PM

#

Maybe if I were to combine the months instead of doing a boxplot for every single day that could work

rich olive Oct 19, 2022, 1:58 PM

#

its still 20 vessels so whatever you think 240 boxes looks like

fossil ivy Oct 19, 2022, 1:58 PM

#

its 2 vessel

rich olive Oct 19, 2022, 1:58 PM

#

oh lol

fossil ivy Oct 19, 2022, 1:59 PM

#

JUV and WTIV

#

yeah

rich olive Oct 19, 2022, 2:00 PM

#

...you could do a 3d heatmap from an offset angle with cell height and whiskers showing spread

fossil ivy Oct 19, 2022, 2:00 PM

#

yeah the bot is right

#

What the hell did you mean lol

rich olive Oct 19, 2022, 2:00 PM

#

one sec

fossil ivy Oct 19, 2022, 2:00 PM

#

uuh 3d heatmap looking nice tho

rich olive Oct 19, 2022, 2:00 PM

#

https://images.app.goo.gl/CKMXCYY2scobrfVb6

#

actually the whiskers would be nonsensical

#

heatmap doesnt make sense with two of one variable

#

can you just dual candlestick chart it

fossil ivy Oct 19, 2022, 2:04 PM

#

ayo

#

calm down with words

#

im out here googling their meaning nonstop haha

#

Wouldn't dual candlestick essentially be dual boxplot tho

rich olive Oct 19, 2022, 2:05 PM

#

like a stock chart but with the grouped bars like in your SO example

#

lmao yeah

#

im dumb

fossil ivy Oct 19, 2022, 2:05 PM

#

yeah nah I think a boxplot would be the best approach here

#

because it is intended to visualize variability isnt it

rich olive Oct 19, 2022, 2:07 PM

#

yeah i guess you have to reduce the timeframe if you wanna see spread

fossil ivy Oct 19, 2022, 2:07 PM

#

yeah I might just do one separately for each month

#

Then I could derive If you use vessel x in month y, the project performance is significantly uncertain some stuff like that
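A rough sketch of the per-month, per-vessel boxplot: stack the runs into one long frame, derive a month column, and group the boxes by month and vessel. The simulated durations are random stand-ins for the output of `simulation(strategy)`, and three runs are used instead of the real number to keep it small.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2022-01-01", "2022-12-31", freq="D")
runs = []
for run in range(3):  # stand-in for the repeated simulation runs
    for vessel, base in (("JUV", 40.0), ("WTIV", 10.0)):
        runs.append(pd.DataFrame({
            "Vessel": vessel,
            "Start Date": dates,
            "Duration": base + rng.normal(0, 3, len(dates)),
        }))
long_df = pd.concat(runs, ignore_index=True)

# One box per (month, vessel) pair instead of one per day
long_df["Month"] = long_df["Start Date"].dt.month
ax = long_df.boxplot(column="Duration", by=["Month", "Vessel"], rot=90, figsize=(12, 4))
n_boxes = long_df.groupby(["Month", "Vessel"]).ngroups
```

With two vessels this gives 24 boxes, which stays readable where 730 daily boxes did not.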

normal hazel Oct 19, 2022, 2:48 PM

#

Hi

#

Has anyone used dbt?

serene scaffold Oct 19, 2022, 2:49 PM

#

normal hazel Anyone have used dbt?

what's that?

fossil ivy Oct 19, 2022, 2:55 PM

#

rich olive yeah i guess you have to reduce the timeframe if you wanna see spread

Oh god have you heard of pandasgui??

normal hazel Oct 19, 2022, 2:55 PM

#

.getdbt

normal hazel Oct 19, 2022, 2:56 PM

#

serene scaffold what's that?

It's a tool for writing select statements inside a data warehouse

pure moat Oct 19, 2022, 4:38 PM

#

guys im getting this error could someone help

#

if query[0] == activationWord:
TypeError: 'builtin_function_or_method' object is not subscriptable

serene scaffold Oct 19, 2022, 4:43 PM

#

pure moat guys im getting this error could someone help

can you do print(query), so we know what it is?

pure moat Oct 19, 2022, 4:43 PM

#

sure

pure moat Oct 19, 2022, 4:43 PM

#

serene scaffold can you do `print(query)`, so we know what it is?

<built-in method split of str object at 0x000001DE82E003F0>

#

this is what got printed

serene scaffold Oct 19, 2022, 4:44 PM

#

pure moat <built-in method split of str object at 0x000001DE82E003F0>

so, somewhere along the way, you tried to use the split method without calling it. can you show where query = is defined?

jolly knoll Oct 19, 2022, 4:44 PM

#

hello i got a score of 1.0 accuracy for my kNN for k=1 to k=15, what am i doing wrong?

arctic wedgeBOT Oct 19, 2022, 4:45 PM

#

Hey @pure moat!

It looks like you tried to attach a Python file - please use a code-pasting service such as https://paste.pythondiscord.com

serene scaffold Oct 19, 2022, 4:45 PM

#

jolly knoll hello i got a score of 1.0 accuracy for my kNN for k=1 to k=15, what am i doing ...

1.0 is 100%, not 1%.

jolly knoll Oct 19, 2022, 4:45 PM

#

serene scaffold 1.0 is 100%, not 1%.

yep

#

exactly

serene scaffold Oct 19, 2022, 4:46 PM

#

look at query = parseCommand().lower().split. you forgot the () at the end of split
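The difference in isolation, as a small sketch: without the parentheses the name holds the method object itself, which cannot be indexed. The string is a stand-in for whatever `parseCommand()` returns.

```python
text = "Computer Open Browser"  # stand-in for what parseCommand() returns

not_called = text.lower().split   # bound method object, not a list
called = text.lower().split()     # list of words

try:
    not_called[0]
    indexing_failed = False
except TypeError:
    # Same error as in the traceback: the method object is not subscriptable
    indexing_failed = True

print(callable(not_called), called[0], indexing_failed)
```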

pure moat Oct 19, 2022, 4:46 PM

#

O

#

TYYY

serene scaffold Oct 19, 2022, 4:47 PM

#

jolly knoll yep

so you think your model can only have 100% accuracy if you've done something incorrectly? we can't guess what you did wrong if we don't know what you did.

serene scaffold Oct 19, 2022, 4:48 PM

#

pure moat TYYY

no problem. thanks for helping me to help you 👍🏻

jolly knoll Oct 19, 2022, 4:50 PM

#

serene scaffold so you think your model can only have 100% accuracy if you've done something inc...

it's normal for my kNN model to get 100%?

serene scaffold Oct 19, 2022, 4:52 PM

#

jolly knoll it's normal for my kNN model to get 100%?

it's possible, if you have enough training data and there aren't very many/any outliers.

jolly knoll Oct 19, 2022, 4:54 PM

#

ahh i've been told to be worried if my model has 100% accuracy haha. my dimensions are (247165, 19) for my training set. i scaled all numerical datasets before inserting it into a kNN

serene scaffold Oct 19, 2022, 5:02 PM

#

usually, if you have 100%, it means that your model is very dependent on the training data, and wouldn't perform well in real situations. but I don't know what your model is intended to do.
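One common reason for a perfect score is evaluating on the same rows the model was fitted on; another is a feature that leaks the label. The sketch below illustrates only the first case, using scikit-learn with labels that are random noise on purpose, so any real skill is impossible. The data and split sizes are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = rng.integers(0, 2, size=500)   # labels are pure noise on purpose

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

train_acc = model.score(X_train, y_train)  # each point is its own nearest neighbour
test_acc = model.score(X_test, y_test)     # held-out rows the model never saw
print(train_acc, test_acc)
```

If the held-out score is also perfect across many values of k, checking the feature columns for something derived from the target is a reasonable next step.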

wet valve Oct 19, 2022, 6:50 PM

#

Hi I’m new to data science so what all modules should I learn in python for data science 🙂

#

i know pandas and numpy

pseudo basin Oct 19, 2022, 7:06 PM

#

wet valve Hi I’m new to data science so what all modules should I learn in python for data...

I'm not sure what data science does actually, but I'm doing big data analysis & AI master right now.

#

the first block, I'm learning Machine learning and AI

#

basically, in ML, we make use of matplotlib to plot graphs and sklearn to do the heavy lifting. try to explain how dataframe says

wet valve Oct 19, 2022, 7:10 PM

#

thanks

serene scaffold Oct 19, 2022, 7:21 PM

#

wet valve Hi I’m new to data science so what all modules should I learn in python for data...

"learning modules" isn't a viable strategy for actually learning data science. you have to understand the theory, and libraries in the data science ecosystem are not designed to gradually teach that to you as you use them. Try one of the books on our website

#

!resources data science

arctic wedgeBOT Oct 19, 2022, 7:21 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

wet valve Oct 19, 2022, 7:23 PM

#

serene scaffold "learning modules" isn't a viable strategy for actually learning data science. y...

sure I’ll check em out

rare socket Oct 19, 2022, 8:46 PM

#

I am trying to manually randomly change the weights and biases in my neural network, but the only way I found to access them is to loop through them. This is what I did, but it is not changing the weights and biases at all. This is very cumbersome; is there an easier way to index through it so that it actually changes the weights and biases?

#

using pytorch

restive python Oct 19, 2022, 8:52 PM

#

Basically I have around 1,250,000 photos uploaded on boto3 and am trying to make an X file with all the rgb values, but colab takes way too long to download the files and turn them into np arrays
anyone have a better idea?

austere swift Oct 19, 2022, 9:01 PM

#

rare socket I am trying to manually randomly change the weights and biases in my neural netw...

You’re changing j (the variable) but not the value in the parameter

#

You should enumerate in the loops to get the indexes along with the values, then change the value at that index to the new value

#

It would be something like param[idx1][idx2] = j
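The difference between changing the loop variable and changing the stored value can be sketched with plain nested lists. The original code was only posted as a screenshot, so the `weights` list here is a stand-in, not the actual parameters. With real PyTorch parameters the same idea applies, though in-place edits are normally done inside `torch.no_grad()`.

```python
# Stand-in for a parameter matrix; values are arbitrary.
weights = [[1.0, 2.0], [3.0, 4.0]]

# Rebinding the loop variable only changes the local name.
for row in weights:
    for w in row:
        w = w + 0.5  # the list still holds the old value
after_rebinding = [row[:] for row in weights]

# Assigning through the indexes changes what the container stores.
for i, row in enumerate(weights):
    for j, w in enumerate(row):
        weights[i][j] = w + 0.5

print(after_rebinding)
print(weights)
```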

serene scaffold Oct 19, 2022, 9:04 PM

#

rare socket I am trying to manually randomly change the weights and biases in my neural netw...

Please don't ask people to read screenshots of text

#

!code

arctic wedgeBOT Oct 19, 2022, 9:04 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

plush jungle Oct 19, 2022, 9:08 PM

#

I'm trying to retrain stylegan3, but I keep running into the following error

#

this command

```
python train.py --outdir=~/training-runs --cfg=stylegan3-t --data=datasets/ffhq_control.zip --gpus=1 --batch=4 --gamma=8.2 --mirror=1 --workers=1 --snap=50 --tick=4 --cbase=16384 --resume=C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\pretrained_models\stylegan3-t-ffhqu-1024x1024.pkl
```

produces this error

```
File "C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\training\training_loop.py", line 162, in training_loop
    misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
File "C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\torch_utils\misc.py", line 163, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (16) must match the size of tensor b (32) at non-singleton dimension 1
```

#

I'm resuming from a model trained on 1024x1024 images from the ffhq dataset

rare socket Oct 19, 2022, 9:09 PM

#

austere swift It would be something like param[idx1][idx2] = j

I see thanks

plush jungle Oct 19, 2022, 9:10 PM

#

and my dataset, "ffhq_control.zip" is 12 images from that same dataset

austere swift Oct 19, 2022, 9:10 PM

#

plush jungle this command ``` python train.py --outdir=~/training-runs --cfg=stylegan3-t --da...

Are your images the same size as the ones it was trained on before

plush jungle Oct 19, 2022, 9:11 PM

#

austere swift Are your images the same size as the ones it was trained on before

yes, they're literally taken from the same dataset because I was getting this same issue on my own images

#

so I figured if I can't even retrain it on the same dataset it's not an issue with my images

#

it also says this

#

```
Output directory:    ~/training-runs\00006-stylegan3-t-ffhq_control-gpus1-batch32-gamma8.2
Number of GPUs:      1
Batch size:          32 images
Training duration:   25000 kimg
Dataset path:        datasets/ffhq_control.zip
Dataset size:        12 images
Dataset resolution:  1024
Dataset labels:      False
Dataset x-flips:     True

Creating output directory...
Launching processes...
Loading training set...

Num images:  24
Image shape: [3, 1024, 1024]
Label shape: [0]
```

desert oar Oct 19, 2022, 9:13 PM

#

do they provide instructions for running it?

plush jungle Oct 19, 2022, 9:13 PM

#

yeah

#

https://github.com/NVlabs/stylegan3

GitHub

GitHub - NVlabs/stylegan3: Official PyTorch implementation of Style...

Official PyTorch implementation of StyleGAN3. Contribute to NVlabs/stylegan3 development by creating an account on GitHub.

#

in the section of the readme under Preparing Datasets and Training

#

my working theory is that it's something to do with my class labels (or lack thereof)

desert oar Oct 19, 2022, 9:15 PM

#

plush jungle in the section of the readme under Preparing Datasets and Training

so you ran python dataset_tool.py first?

plush jungle Oct 19, 2022, 9:15 PM

#

yeah, like this

#

```
python dataset_tool.py --source=C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\datasets\ffhq_control --dest=datasets/ffhq_control.zip
```

desert oar Oct 19, 2022, 9:15 PM

#

what happens if you follow one of their examples exactly as written? e.g. the metfaces example

#

"Fine-tune StyleGAN3-R for MetFaces-U using 1 GPU, starting from the pre-trained FFHQ-U pickle."

plush jungle Oct 19, 2022, 9:16 PM

#

don't you need to download the entire metfaces dataset for that?

#

that's a terabyte at least I think

#

70,000 images or so

#

which is why I instead tried it with the 12 images I downloaded from ffhq

#

you think I'm missing a config file that comes with those datasets?

desert oar Oct 19, 2022, 9:21 PM

#

oh i didn't realize it was a huge dataset

#

maybe there's a sample you can download

#

oh i see, the ffhq set is more manageable

#

seems like that should work too though

plush jungle Oct 19, 2022, 9:23 PM

#

from the readme:

restive python Oct 19, 2022, 9:23 PM

#

anyone know how to use boto3

#

having a little trouble w it rn

plush jungle Oct 19, 2022, 9:23 PM

#

```
Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels. Custom datasets can be created from a folder containing images; see python dataset_tool.py --help for more information. Alternatively, the folder can also be used directly as a dataset, without running it through dataset_tool.py first, but doing so may lead to suboptimal performance.
```

desert oar Oct 19, 2022, 9:23 PM

#

makes sense

#

debugging other people's code is always difficult

#

hard to know where the breakdown is... if it were me, i'd probably file a bug report

plush jungle Oct 19, 2022, 9:25 PM

#

this person seemed to have the same issue

#

https://stackoverflow.com/questions/71103106/stylegan3-stylegan2-ada-tensor-mismatch-error-for-every-256-or-512-flickr-relate

Stack Overflow

stylegan3 stylegan2-ada tensor mismatch error for every 256 or 512 ...

Anyone having the same tensor size mismatch when trying finetuning on ffhq,ffhqu or celebahq models with stylegan3 (and with --cfg=stylegan2)?
With afhqv2 and metfaces I had no problems at 512 and...

desert oar Oct 19, 2022, 9:25 PM

#

restive python anyone know how to use boto3

this is a good question for a help channel, see #❓｜how-to-get-help . the channel #data-science-and-ml is for something specific and not really related to boto3
don't "ask to ask". the #❓｜how-to-get-help information (as well as the popup when you ask a new question) provides detailed instructions for asking answerable questions. read them.

plush jungle Oct 19, 2022, 9:25 PM

#

so I added this --cbase=16384 argument

#

per the answer

restive python Oct 19, 2022, 9:26 PM

#

desert oar 1) this is a good question for a help channel, see <#704250143020417084> . the c...

bru i asked this in help

#

and they told me to come here

desert oar Oct 19, 2022, 9:26 PM

#

restive python and they told me to come here

then state your question in detail and maybe someone can help. but boto3 is the client library for AWS. this channel is about data science.

#

please do read the guide on asking good questions

desert oar Oct 19, 2022, 9:27 PM

#

plush jungle so I added this --cbase=16384 argument

yeah... i wonder if there's some other magic required number here. the cbase argument is "capacity multiplier" and i have no idea what that means

#

apparently your cbase etc. options need to match the pre-trained model

restive python Oct 19, 2022, 9:27 PM

#

i have 1.2 mil photos on s3 and am trying to turn them into a large csv file. Anyone have any idea on how to pickle these, because right now I am downloading each one and it's going to take around 50 days

desert oar Oct 19, 2022, 9:28 PM

#

restive python i have 1.2 mil photos on s3 and am trying to turn them into a large csv file. An...

csv? how do you expect to turn a photo into csv data?

restive python Oct 19, 2022, 9:28 PM

#

pixel data

desert oar Oct 19, 2022, 9:28 PM

#

what are you doing with them? why do you need csv data?

restive python Oct 19, 2022, 9:28 PM

#

into a huge np array

desert oar Oct 19, 2022, 9:28 PM

#

...do you see how withholding information in your question wastes both yours and everyone else's time?

#

now it is (kind of) a data science question

plush jungle Oct 19, 2022, 9:29 PM

#

you're trying to make a single numpy array of 1.2 million images?

desert oar Oct 19, 2022, 9:29 PM

#

why do you need it all in a huge numpy array?

#

that seems ill-advised and like an "XY" problem

restive python Oct 19, 2022, 9:29 PM

#

desert oar why do you need it all in a huge numpy array?

i don't im trying to find some better way to do it

plush jungle Oct 19, 2022, 9:29 PM

#

Is there even enough ram on any computer to do that?

restive python Oct 19, 2022, 9:29 PM

#

that's why im asking

restive python Oct 19, 2022, 9:29 PM

#

plush jungle Is there even enough ram on any computer to do that?

that's the problem

#

idk how to manage this much data

desert oar Oct 19, 2022, 9:29 PM

#

restive python i don't im trying to find some better way to do it

what are you actually trying to do? don't force people to interrogate you for information.

plush jungle Oct 19, 2022, 9:29 PM

#

restive python that's the problem

I'm with desert oar on this one, tell us exactly what the end goal is

restive python Oct 19, 2022, 9:31 PM

#

i'm making an animal recognition software and I have a harddrive with a lot of trail cam footage that I'm trying to make into something I can train a model with

#

I'm new to this

#

and am trying to learn

#

I've just never made something with this much data

desert oar Oct 19, 2022, 9:33 PM

#

restive python I've just never made something with this much data

normally you don't try to load all this data at once, and normally you don't need to load it into one big numpy array. ML frameworks like pytorch have some kind of "data loader" mechanism, and usually that includes ready-to-use functionality for working with images.

#

it's sometimes enticing to try to DIY things, but with a relatively large amount of data, and the relatively sophisticated models required to do ML on it, then you should probably just use a framework and spare yourself the difficulty
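The core idea behind a "data loader" can be sketched in a few lines: keep only the file paths in memory and load one small batch at a time. This is a rough illustration, not how any particular framework implements it. `iter_batches`, `fake_load`, and the file names are all hypothetical, and a real loader would decode the image instead of returning a string.

```python
from itertools import islice


def iter_batches(paths, batch_size, load):
    """Yield lists of loaded items, reading at most batch_size files at a time."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield [load(p) for p in chunk]


def fake_load(path):
    """Placeholder for real image decoding."""
    return f"pixels of {path}"


# Hypothetical file names; only these short strings live in memory up front.
paths = [f"img_{i}.jpg" for i in range(5)]

batch_sizes = [len(batch) for batch in iter_batches(paths, 2, fake_load)]
print(batch_sizes)
```

Because the generator never holds more than one batch, memory use stays flat no matter how many paths there are.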

restive python Oct 19, 2022, 9:34 PM

#

desert oar normally you don't try to load all this data at once, and normally you don't nee...

i'll look into this

#

thank you so much ❤️

plush jungle Oct 19, 2022, 9:34 PM

#

restive python i'm making an animal recognition software and I have a harddrive with a lot of t...

is the footage labelled? cause if it's not labelled it probably won't be useful as a training dataset for animal classification

restive python Oct 19, 2022, 9:35 PM

#

plush jungle is the footage labelled? cause if it's not labelled it probably won't be useful...

it is!

plush jungle Oct 19, 2022, 9:35 PM

#

restive python it is!

wow, nice

desert oar Oct 19, 2022, 9:35 PM

#

restive python thank you so much ❤️

remember: if you stated your actual question first, then you'd have gotten this answer a lot faster. see: https://xyproblem.info/

The XY Problem

Asking about your attempted solution rather than your actual problem

fringe anvil Oct 19, 2022, 9:36 PM

#

so uh, im trying to do this (first image) but im getting this (second image)

here's the code

fig2,ax2 = plt.subplots(figsize=(6,4))
fig2.set_facecolor("None")

dbl_ratio = pd.DataFrame(df2["player1 double faults"]/df2["player1 total points total"]) # good
dbl_ratio_avr = dbl_ratio
year_grpby = df2.groupby("year").max()

x,y = df2["start date"],dbl_ratio

ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")
ax2.scatter(x,y,alpha=0.3) # good
ax2.plot(year_grpby,dbl_ratio,"-",color="orange")
mpl.style.use("default")

desert oar Oct 19, 2022, 9:38 PM

#

fringe anvil so uh, im trying to do this (first image) but im getting this (second image) he...

what is the data type of year? and what is the data type of start date?

fringe anvil Oct 19, 2022, 9:39 PM

#

columns in a data frame

#

pd.read_csv

#

desert oar Oct 19, 2022, 9:40 PM

#

fringe anvil columns in a data frame

ok, but what data types? are they strings? datetime? something else?

#

those look like strings

#

i highly recommend instead converting start date to a proper datetime column

#

!d pandas.to_datetime

arctic wedgeBOT Oct 19, 2022, 9:40 PM

#

pandas.to_datetime

```py
pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=True)
```
Convert argument to datetime.

This function converts a scalar, array-like, [`Series`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html#pandas.Series "pandas.Series") or [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame "pandas.DataFrame")/dict-like to a pandas datetime object.

fringe anvil Oct 19, 2022, 9:41 PM

#

year is pandas.core.series.Series and same for start date

desert oar Oct 19, 2022, 9:41 PM

#

fringe anvil year is pandas.core.series.Series and same for start date

every pandas series has its own "dtype" which describes the data stored in it. that is what i'm asking about

#

df['start date'].dtype

fringe anvil Oct 19, 2022, 9:41 PM

#

oh sorry, i just started my course lol

#

desert oar Oct 19, 2022, 9:43 PM

#

if you convert from strings to a proper datetime type, then you don't need the year column at all. you can do something like df.resample('AS', on='start date').max()

#

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html

desert oar Oct 19, 2022, 9:43 PM

#

fringe anvil

'O' usually means "strings"

#

also... what is year_grpby supposed to be the max of? right now your code takes the max of all columns

#

furthermore that line looks like a mean, not a max (and a smoothed one at that)

#

df2['start date'] = pd.to_datetime(df2['start date'])

x = df2["start date"]
y = dbl_ratio
y_year_mean = df2.resample('AS', on='start date').mean()

this might get you started, but i think you are missing some other things here

#

please do read the docs and not just copy my code though

novel python Oct 19, 2022, 9:46 PM

#

@desert oar not sure if you'll remember me from yesterday, but only got time now to test again. Whenever you're free let me know and I'll send you the sample and the pivot table I created.

fringe anvil Oct 19, 2022, 9:48 PM

#

desert oar please do read the docs and not just copy my code though

ive been on this for 2 days in a row, ive tried pretty much everything of what they provide in the lecture notebook. scatter works, but i cant get the year on x to show properly and also my mean for the line isnt working

fringe anvil Oct 19, 2022, 9:49 PM

#

desert oar please do read the docs and not just copy my code though

ill be reading now. ill try to come up with something. thanks a lot

#

lmfao, getting somewhere i guess

desert oar Oct 19, 2022, 9:56 PM

#

fringe anvil ive been on this for 2 days in a row, ive tried pretty much everything of what t...

did they not show you how to work with datetime data in pandas?

fringe anvil Oct 19, 2022, 9:57 PM

#

desert oar did they not show you how to work with datetime data in pandas?

its a bootcamp, so im doing my best to follow

desert oar Oct 19, 2022, 9:59 PM

#

fringe anvil lmfao, getting somewhere i guess

that looks more like a daily maximum. maybe 'AS' was wrong, but that's what i thought i saw in the docs for "start of year"

desert oar Oct 19, 2022, 10:00 PM

#

fringe anvil ive been on this for 2 days in a row, ive tried pretty much everything of what t...

well you did max() in your code, not mean()!

fringe anvil Oct 19, 2022, 10:01 PM

#

desert oar well you did `max()` in your code, not `mean()`!

thats the last line of code i wrote, i tried anything that i could remember. sum() mean() max() etc

desert oar Oct 19, 2022, 10:01 PM

#

fringe anvil thats the last line of code i wrote, i tried anything that i could remember. sum...

well why would you do max when you meant mean?

fringe anvil Oct 19, 2022, 10:01 PM

#

in my head its clear what i want to do, but putting it into code, doesnt look like its working much

desert oar Oct 19, 2022, 10:02 PM

#

fringe anvil in my head its clear what i want to do, but putting it into code, doesnt look li...

well answer the practical question here. you want the yearly mean, right? so why would you use anything other than mean?

#

show me the code you used for the messed up chart you just posted above

fringe anvil Oct 19, 2022, 10:06 PM

#

desert oar well answer the practical question here. you want the yearly mean, right? so why...

it might seem simple to you, but to me, it wasnt working. ive tried 100s of iteration to the code. for some reasons, its just not clicking for this exercise. its really the first one where im having this much trouble. im behind, this is the first workshop, theres a second one. and i only have a few days left to upload it to github. i have a full time job, i wish i could take the time to dig into every single documentation, but right now its not possible

fringe anvil Oct 19, 2022, 10:07 PM

#

desert oar show me the code you used for the messed up chart you just posted above

fig2,ax2 = plt.subplots(figsize=(6,4))
fig2.set_facecolor("None")

df2['start date'] = pd.to_datetime(df2['start date'])

dbl_ratio = pd.DataFrame(df2["player1 double faults"]/df2["player1 total points total"]) # good
dbl_ratio_avr = dbl_ratio

x = df2["start date"]
y = dbl_ratio

ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")
ax2.scatter(x,y,alpha=0.3) # good
ax2.plot(x,dbl_ratio,"-",color="orange")
mpl.style.use("default")

desert oar Oct 19, 2022, 10:11 PM

#

fringe anvil it might seem simple to you, but to me, it wasnt working. ive tried 100s of iter...

in the future, i suggest asking sooner! if you don't understand an error message, trying other random stuff usually isn't a good approach

desert oar Oct 19, 2022, 10:12 PM

#

fringe anvil ```py fig2,ax2 = plt.subplots(figsize=(6,4)) fig2.set_facecolor("None") df2['st...

you forgot to actually take any kind of average:

dbl_ratio_avr = dbl_ratio

use the resample code i showed you

#

i totally understand the stress of being short on time and not understanding what's going on

fringe anvil Oct 19, 2022, 10:13 PM

#

desert oar in the future, i suggest asking sooner! if you don't understand an error message...

yeah, i really thought i could handle it myself

desert oar Oct 19, 2022, 10:13 PM

#

also ax2.plot(x,dbl_ratio,"-",color="orange") you didn't even plot dbl_ratio_avr

#

i think you understand more than you realize, you are just making silly mistakes at this point. maybe fatigue?

fringe anvil Oct 19, 2022, 10:14 PM

#

desert oar i think you understand more than you realize, you are just making silly mistakes...

yeah, full time welder. it's exhausting. usually my code is cleaner during the weekend

#

been at it for 12 years. thats why im trying the bootcamp and maybe get a job. change of career. im all in

desert oar Oct 19, 2022, 10:16 PM

#

you only need to do this once, when you load the dataset:

df2['start date'] = pd.to_datetime(df2['start date'])

and this should produce something like the plot you're looking for:

dbl_ratio = pd.DataFrame(df2["player1 double faults"] / df2["player1 total points total"])
dbl_ratio_avg = dbl_ratio.resample('AS', on='start date').mean()

x = df2["start date"]
y = dbl_ratio

fig2, ax2 = plt.subplots(figsize=(6,4))
fig2.set_facecolor("None")

ax2.scatter(x, y, alpha=0.3)
ax2.plot(x, dbl_ratio_avg, "-", color="C1")

ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")

plt.show()

fringe anvil Oct 19, 2022, 10:16 PM

#

also, i usually come back from work, take a shower and code.. then i realise, like now, that i did not eat yet

desert oar Oct 19, 2022, 10:16 PM

#

go eat and don't look at a computer screen. then go look at the code i just posted above and see if it makes more sense

fringe anvil Oct 19, 2022, 10:18 PM

#

desert oar go eat and don't look at a computer screen. then go look at the code i just post...

yes sir! 🫡

wise iris Oct 19, 2022, 10:40 PM

#

can someone explain me what's pytorch and why do I need it to train YoloV5?

desert oar Oct 19, 2022, 10:41 PM

#

wise iris can someone explain me what's pytorch and why do I need it to train YoloV5?

pytorch is a library that helps you write sophisticated machine learning models. if you need it to train yolov5, that's because the code for yolov5 was written using pytorch.

wise iris Oct 19, 2022, 10:42 PM

#

desert oar pytorch is a library that helps you write sophisticated machine learning models....

oh ok

wise iris Oct 19, 2022, 10:43 PM

#

desert oar pytorch is a library that helps you write sophisticated machine learning models....

also, I have no idea what CUDA i'm supposed to choose

#

or how to find out

desert oar Oct 19, 2022, 10:45 PM

#

wise iris also, I have no idea what CUDA i'm supposed to choose

did you install a cuda toolkit yet?

wise iris Oct 19, 2022, 10:45 PM

#

no lol

desert oar Oct 19, 2022, 10:45 PM

#

do you have a python environment set up?

wise iris Oct 19, 2022, 10:45 PM

#

I was following tutorials and they just brought me here

wise iris Oct 19, 2022, 10:45 PM

#

desert oar do you have a python environment set up?

I created a venv

plush jungle Oct 19, 2022, 10:48 PM

#

I managed to get the stylegan code to not throw any errors by just trying random pretrained models until one worked. But now the code freezes at

```
Setting up PyTorch plugin "filtered_lrelu_plugin"...
```

desert oar Oct 19, 2022, 10:48 PM

#

tbh it might be a little easier to do this with conda, but installing and setting up conda is a bit of a pain

plush jungle Oct 19, 2022, 10:48 PM

#

is there any way to know why it's freezing here?

wise iris Oct 19, 2022, 10:48 PM

#

desert oar tbh it might be a little easier to do this with conda, but installing and settin...

so should I do it?

desert oar Oct 19, 2022, 10:49 PM

#

wise iris so should I do it?

no, don't bother.

activate the venv, then run:

python -m pip install --extra-index-url https://pypi.ngc.nvidia.com nvidia-cuda-runtime-cu11

this will install cuda toolkit, as per https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#pip-wheels-installation-windows

you should be able to check the exact cuda version that was installed using the nvcc command.

then you should be able to run (with the venv still active):

python -m pip install torch torchvision

and you should have torch available in the venv

Installation Guide Windows :: CUDA Toolkit Documentation

The installation instructions for the CUDA Toolkit on MS-Windows systems.
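Once the installs finish, a quick way to see what the venv ended up with is to ask torch itself. This is a small sketch, and the helper name `torch_cuda_report` is made up. It guards the import so it also runs in an environment where torch is not installed.

```python
import importlib.util


def torch_cuda_report():
    """Summarise whether torch is importable and whether it can see CUDA."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "cuda_version": None}
    import torch
    return {
        "torch_installed": True,
        # True only if this torch build can actually use a GPU.
        "cuda_available": torch.cuda.is_available(),
        # CUDA version torch was built against (None for CPU-only builds).
        "cuda_version": torch.version.cuda,
    }


report = torch_cuda_report()
print(report)
```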

#

i assume you installed python 3.10 from python.org?

wise iris Oct 19, 2022, 10:51 PM

#

desert oar i assume you installed python 3.10 from python.org?

it was a while ago, I guess I did

wise iris Oct 19, 2022, 10:52 PM

#

desert oar no, don't bother. activate the venv, then run: ```python python -m pip install ...

maybe the nvcc command is not working?

fringe anvil Oct 19, 2022, 10:53 PM

#

@desert oar hmm i get KeyError: 'The grouper name start date is not found'

desert oar Oct 19, 2022, 10:53 PM

#

wise iris maybe the nvcc command is not working?

meh, skip it. idk 😆

fringe anvil Oct 19, 2022, 10:54 PM

#

in the documentation for resample, i cant find "AS". where did you get those keyword?

wise iris Oct 19, 2022, 10:54 PM

#

desert oar meh, skip it. idk 😆

ok, so now I don't need to install it through the website?

desert oar Oct 19, 2022, 10:54 PM

#

fringe anvil <@389497659087650836> hmm i get KeyError: 'The grouper name start date is not fo...

oh my mistake. you should probably put dbl_ratio back into the dataframe so you can do this more easily:

df2['dbl_ratio'] = dbl_ratio
dbl_ratio_avg = df2.resample('AS', on='start date')['dbl_ratio'].mean()

desert oar Oct 19, 2022, 10:54 PM

#

wise iris ok, so now I don't need to install it through the website?

i don't think you need to

#

@fringe anvil i think it would be easier to do this using a datetime index, but that's a whole big pandas topic that i think we can hold off on (but you should learn it at some point)

#

oh also, one more thing

#

you don't need pd.DataFrame here:

dbl_ratio = df2["player1 double faults"] / df2["player1 total points total"]

#

the full code:

df2 = ...

df2["start date"] = pd.to_datetime(df2["start date"])

df2["dbl_ratio"] = df2["player1 double faults"] / df2["player1 total points total"]

dbl_ratio_year_avg = df2.resample("AS", on="start date")["dbl_ratio"].mean()

x = df2["start date"]
y = df2["dbl_ratio"]

fig2, ax2 = plt.subplots(figsize=(6,4))
fig2.set_facecolor("None")

ax2.scatter(x, y, alpha=0.3)
ax2.plot(x, dbl_ratio_year_avg, "-", color="C1")

ax2.set_xlabel("Year")
ax2.set_ylabel("Double faults per match")

plt.show()

fringe anvil Oct 19, 2022, 11:02 PM

#

hmm, title dont show on the y and x now, and the style isnt white anymore.. idk if its my computer being janky lol.. i restarted the kernel reran everything. now i get ValueError: x and y must have same first dimension, but have shapes (1179,) and (15,)

#

ok set_xlabel needs to be called before .scatter and .plot

plush jungle Oct 19, 2022, 11:04 PM

#

ok I found out that the reason it's freezing at filtered_lrelu_plugin is cause I have two versions of it in my pytorch files

#

how do I know which one to delete?

fringe anvil Oct 19, 2022, 11:05 PM

#

3.9 is the latest version, but which version does pytorch use?

novel python Oct 19, 2022, 11:27 PM

#

I just pivoted a table and wanted to get rid of the top row (all DATA_USAGE_GB__C), and bring the months 1 row down so that the 3rd current row becomes the top one

#

wanted to do that with python, not simply moving them on the .csv file

#

anyone got an idea how to do that? got kinda confused trying here

fringe anvil Oct 19, 2022, 11:39 PM

#

@desert oar the resample creates a shape of 15, which doesnt match the shape of x "start date" start date has 1179 row

#

we passed it the whole column, it should have the same rows, both of them

fringe anvil Oct 20, 2022, 1:09 AM

#

new code, new error. getting closer to the shape of x..
im able to generate the same graph with groupby.. not sure if its better or not
ValueError: x and y must have same first dimension, but have shapes (1179,) and (926,)

df2["start date"] = pd.to_datetime(df2["start date"]) # should be good now

fig2, ax2 = plt.subplots(figsize=(6,4)) # good
fig2.set_facecolor("None") # good

plt.style.use("default") # good
ax2.set_xlabel("Year") # good
ax2.set_ylabel("Double faults per match") # good

df2["dbl_ratio"] = (df2["player1 double faults"]/df2["player1 total points total"]) # good
dbl_ratio_avr = df2.groupby(["start date","dbl_ratio"])["dbl_ratio"].mean() # not good


x = df2["start date"] # good
y = df2["dbl_ratio"] # good

ax2.scatter(x, y, alpha=0.3) # good
ax2.plot(x, dbl_ratio_avr, "-", color="C1") # need to change something for y

#

#

152+926=1078 .. so still missing 101 rows .. ah geez this graph.. 3 failed days in a row lol

desert oar Oct 20, 2022, 1:13 AM

#

fringe anvil <@389497659087650836> the resample creates a shape of 15, which doesnt match the...

sorry my mistake again. you'll need to plot using

ax2.plot(dbl_ratio_avr.index, dbl_ratio_avr, "-", color="C1") # need to change something for y

#

or even just

dbl_ratio_avr.plot(ax=ax2, color='C1')

using the pandas built-in plotting helpers

#

(this is a taste of why indexes are useful)
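The shape error comes down to this: a yearly average has one value per year, so it has to be plotted against the years, not against every match date. Here is a standard-library sketch of that idea with invented rows. It does by hand roughly what the resample-then-mean step does, and it is not the pandas implementation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Invented (start date, double-fault ratio) rows.
rows = [
    ("2019-03-01", 0.125),
    ("2019-07-15", 0.375),
    ("2020-01-20", 0.25),
    ("2020-06-02", 0.5),
    ("2020-09-09", 0.75),
]

# Group the ratios by the year of their date.
by_year = defaultdict(list)
for date_str, ratio in rows:
    by_year[datetime.strptime(date_str, "%Y-%m-%d").year].append(ratio)

# One x value and one y value per year: these two lists always match in length.
years = sorted(by_year)
yearly_mean = [mean(by_year[y]) for y in years]

print(len(rows), len(years))
print(years, yearly_mean)
```

Plotting `yearly_mean` against `years` works; plotting it against the five original dates would fail in the same way as the 1179 vs 15 case.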

desert oar Oct 20, 2022, 1:14 AM

#

plush jungle ok I found out that the reason it's freezing at filtered_lrelu_plugin is cause I h...

i wouldn't just delete stuff by hand. pip uninstall the things you don't need

fringe anvil Oct 20, 2022, 1:17 AM

#

OMG

#

@desert oar ❤️

lapis sequoia Oct 20, 2022, 4:47 AM

#

how can I keep rows in a df, which have/don't have a match in another df while merging

#

anti_join, semi_join kind of thing

fossil sphinx Oct 20, 2022, 5:02 AM

#

Greetings! I have a functional json implementation, for the most part. I am having difficulties with this section:

        puntreturns_t1 = dataGameStats['teams'][0]['stats'][7]['data'].split("-")[0]
        puntreturnsyards_t1 = dataGameStats['teams'][0]['stats'][7]['data'].split("-")[0]

Appropriate JSON code:

       {
           "stat" : "Punt Returns: Number-Yards",
           "data" : "-"
       }

How can I get puntreturns_t1 AND puntreturnsyards_t1 == 0 / None?

I am getting the following error with the current code:

ValueError: invalid literal for int() with base 10: ''
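The error happens because `"-".split("-")` gives two empty strings, and `int("")` raises this ValueError. One possible approach is a small helper that treats an empty part as 0. The name `parse_number_yards` is made up for this sketch. Note also that both lines in the question take index `[0]`, so the yards line presumably wants the second part.

```python
def parse_number_yards(data):
    """Split a 'Number-Yards' string into two ints, using 0 for empty parts."""
    # partition splits on the first '-' only, so negative yards like '2--4' survive.
    number, _, yards = data.partition("-")

    def to_int(text):
        text = text.strip()
        return int(text) if text else 0

    return to_int(number), to_int(yards)


empty = parse_number_yards("-")
normal = parse_number_yards("3-45")
negative = parse_number_yards("2--4")
print(empty, normal, negative)
```

Returning `None` instead of 0 for an empty part would only need a change to the `else` branch of `to_int`.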

trail yacht Oct 20, 2022, 5:39 AM

#

I need some help. I have to do a project on pneumonia detection using deep learning and machine learning. Its a group project and we just know machine learning basics and a little algo. We don't know any deep learning. We do have the code but don't know how to distribute among 3 people. And also how to quickly learn deep learning.. just need to learn straight from the code... They will teach us later. Any tactics?

lapis sequoia Oct 20, 2022, 6:04 AM

#

lapis sequoia how can I keep rows in a df, which have/don't have a match in another df while m...

df.merge should work.
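A sketch of how `merge` can give anti-join and semi-join behaviour, using the `indicator=True` option to mark where each row came from. The two frames and the column name `key` are invented for the example.

```python
import pandas as pd

# Invented example frames.
left = pd.DataFrame({"key": [1, 2, 3, 4], "val": ["a", "b", "c", "d"]})
right = pd.DataFrame({"key": [2, 4, 5]})

# Left merge against the distinct keys of the other frame;
# the added _merge column says "both" or "left_only" for each row.
merged = left.merge(right[["key"]].drop_duplicates(), on="key",
                    how="left", indicator=True)

# Anti join: rows of left with no match in right.
anti = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
# Semi join: rows of left that do have a match in right.
semi = merged[merged["_merge"] == "both"].drop(columns="_merge")

print(anti)
print(semi)
```

Dropping duplicate keys on the right side first keeps the semi join from repeating left rows.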

idle cairn Oct 20, 2022, 12:15 PM

#

Does anyone know why my spline looks like this (blue)? I would expect it to be like the one I drew on top (red)..

serene scaffold Oct 20, 2022, 12:28 PM

#

idle cairn Does anyone know why my spline looks like this (blue)? I would expect it to be l...

Please do not ask people to read screenshots of text.

#

!code

arctic wedgeBOT Oct 20, 2022, 12:28 PM

#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

arctic wedgeBOT Oct 20, 2022, 1:02 PM

#

:incoming_envelope: :ok_hand: applied mute to @earnest raven until <t:1666271518:f> (10 minutes) (reason: newlines rule: sent 106 newlines in 10s).

The <@&831776746206265384> have been alerted for review.

serene scaffold Oct 20, 2022, 1:04 PM

#

!unmute 130213385265610753

arctic wedgeBOT Oct 20, 2022, 1:04 PM

#

:incoming_envelope: :ok_hand: pardoned infraction mute for @earnest raven.

serene scaffold Oct 20, 2022, 1:04 PM

#

@earnest raven use the paste bin

#

!paste

arctic wedgeBOT Oct 20, 2022, 1:04 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.

earnest raven Oct 20, 2022, 1:04 PM

#

Thank you! appreciate it.

serene scaffold Oct 20, 2022, 1:05 PM

#

but I appreciate that you used ```py. sorry you got zapped.

earnest raven Oct 20, 2022, 1:05 PM

#

No problem 🙂

#

I've been trying to make a distribution graph based on a dataset that contains duplicate data in an array like so: [5, 15, 15, 15, 15, 15, 20, 20, 20, 30]
However when I change the X axis to be a linear range instead of the actual values of the array, the graph morphs into something completely different.
This is my code which results in the first graph: https://paste.pythondiscord.com/omojaqoxoz
The second graph was created by using x = np.linspace(min(gatewayLatencyValues), max(gatewayLatencyValues), len(gatewayLatencyValues)), however this completely morphs the graph. It is notable that the boxplot stays correct regardless of what the X axis is, since it is generated by matplotlib and not based on array indices.

Anyone have any idea how to solve this?

#

This is the code for the calculate_normal function: https://paste.pythondiscord.com/iromatirox

mighty patio Oct 20, 2022, 2:15 PM

#

what are you trying to achieve by changing the values on the x-axis?

#

it looks like you're trying to plot a histogram of discrete values, perhaps https://numpy.org/doc/stable/reference/generated/numpy.histogram.html will help you?

earnest raven Oct 20, 2022, 2:26 PM

#

mighty patio what are you trying to achieve by changing the values on the x-axis?

I'd like to smooth the lines out, but for that I need an X axis that has a lot of steps in order to use cubic interpolation

mighty patio Oct 20, 2022, 2:34 PM

#

earnest raven Id like to smooth the lines out, but for that I need an X axis that has a lot of...

Your calculate_normal function does 2 things.
First it calculates the average and standard deviation
Then it makes the normal
You should separate this into two functions, the first avg, std = get_fit(array) and the second y = make_curve(x, avg, std)
The x you input to the second can have a high density of points, and should not be the same as the array you input to the first. This will give you a smooth curve
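
A rough sketch of that split. The function names get_fit and make_curve come from the message above; the bodies are an assumption, since the original calculate_normal is only linked and not shown:

```python
import numpy as np


def get_fit(values):
    """Estimate mean and standard deviation from the raw samples."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std()


def make_curve(x, avg, std):
    """Evaluate the normal probability density at every point in x."""
    return np.exp(-0.5 * ((x - avg) / std) ** 2) / (std * np.sqrt(2 * np.pi))


latencies = [5, 15, 15, 15, 15, 15, 20, 20, 20, 30]
avg, std = get_fit(latencies)

# Dense, evenly spaced x axis, independent of how many samples there are.
x = np.linspace(min(latencies), max(latencies), 500)
y = make_curve(x, avg, std)
```

Because x is no longer tied to the sample array, a second dataset can be fitted with get_fit and drawn with make_curve on the same x.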

earnest raven Oct 20, 2022, 2:35 PM

#

I also forgot to mention another thing, i'd like to add another graph to it with a different dataset using the same x axis

#

but I will keep what you mentioned in mind

#

I tried to just add it to the existing one, but it has more datapoints but less latency so that means the entire x axis has a different scale

mighty patio Oct 20, 2022, 2:40 PM

#

I also advise you to set both dpi and figsize in plt.subplots(). Doing so allows you to control the fontsize regardless of the number of pixels in your graph
A high dpi+low figsize makes the text big while low dpi+high figsize makes the font small
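
A small sketch of that effect (the numbers are arbitrary): both figures below are 1200x800 pixels, but the text in the first one renders larger relative to the plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Same pixel size (1200x800), different apparent text size.
fig_big_text, ax1 = plt.subplots(figsize=(6, 4), dpi=200)
fig_small_text, ax2 = plt.subplots(figsize=(12, 8), dpi=100)

for ax in (ax1, ax2):
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_title("Same font size in points")
```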

mint palm Oct 20, 2022, 2:40 PM

#

if I do

from numba import cuda 
device = cuda.get_current_device()
device.reset()

will it affect other users(using that gpu)?

mighty patio Oct 20, 2022, 2:40 PM

#

mint palm if do ```py from numba import cuda device = cuda.get_current_device() device.r...

it shouldn't

mint palm Oct 20, 2022, 2:41 PM

#

i hope my prof data doesnt get reset

earnest raven Oct 20, 2022, 2:46 PM

#

mighty patio I also advise you to set both `dpi` and `figsize` in `plt.subplots()`. Doing so...

Good shout, looks much better now! Appreciate it.

mint palm Oct 20, 2022, 2:57 PM

#

RuntimeError: CUDA error: out of memory

#

what to do?

serene scaffold Oct 20, 2022, 2:59 PM

#

mint palm RuntimeError: CUDA error: out of memory

in the absence of further information, all we can suggest is to find a GPU with more memory.

#

keep in mind that none of us have any idea what you're doing that resulted in you getting that error unless you tell us.

mint palm Oct 20, 2022, 3:03 PM

#

is this relevant?

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            503         414          40           8          47          75
Swap:           255         255           0

#

seems less than normal to me though

serene scaffold Oct 20, 2022, 3:04 PM

#

mint palm is this relevant? ``` $ free -g total used free ...

I assume you were trying to train a model, or something. your options are to get a GPU with more memory, or see if you can still train your model by using the available memory more efficiently

#

but if the model itself needs more memory than the size of the GPU, I think you're SOL.

mint palm Oct 20, 2022, 3:05 PM

#

serene scaffold I assume you were trying to train a model, or something. your options are to get...

i am trying to extract feature of videos. using resNext model

serene scaffold Oct 20, 2022, 3:05 PM

#

mint palm i am trying to extract feature of videos. using resNext model

how big is the model, and how much memory does your GPU have? please answer using the same unit for both

mint palm Oct 20, 2022, 3:05 PM

#

my video is pretty small, and i am doing it one by one

#

ok one minute

#

#

its this one(the model).
but i dont know about GPU, can i find out over ssh?

sharp citrus Oct 20, 2022, 3:31 PM

#

hello guys, I'm very new here on discord and to data science. I started an internship and I have to do some forecasting with ml. is there anybody who can help me find some resources?

mint palm Oct 20, 2022, 3:41 PM

#

serene scaffold how big is the model, and how much memory does your GPU have? please answer usin...

#

|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 42%   89C    P2   282W / 350W |  23647MiB / 24268MiB |     92%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:25:00.0 Off |                  N/A |
| 47%   86C    P2   215W / 350W |   2667MiB / 24268MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:41:00.0 Off |                  N/A |
| 30%   37C    P8    21W / 350W |  14999MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:61:00.0 Off |                  N/A |
| 30%   34C    P8    25W / 350W |  13515MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce ...  Off  | 00000000:81:00.0 Off |                  N/A |
| 87%   68C    P2   302W / 350W |  16301MiB / 24268MiB |     98%      Default |
|                               |                      |                  N/A |
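
The same numbers can be read from Python over ssh. A hedged sketch: the nvidia-smi query flags are the standard ones, while the helper names parse_gpu_memory and query_gpu_memory are made up for illustration:

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' lines (MiB) into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        gpus.append({"index": index, "used": used, "total": total, "free": total - used})
    return gpus


def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory; requires the NVIDIA driver tools."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)


# Numbers copied from the first three GPUs in the table above.
sample = "0, 23647, 24268\n1, 2667, 24268\n2, 14999, 24268\n"
gpus = parse_gpu_memory(sample)
emptiest = max(gpus, key=lambda g: g["free"])
```

In the table above GPU 0 has only about 600 MiB free, which would be consistent with the out-of-memory error if the job ran there. Setting the CUDA_VISIBLE_DEVICES environment variable to a less loaded GPU before starting Python is one common way to pick another device.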

#

lapis sequoia Oct 20, 2022, 3:44 PM

#

in sklearn's accuracy_score function, how do I implement sample_weight? https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html
would I use
label, count = np.unique(y_true, return_counts = True)
and call accuracy_score(y_true, y_pred, count)
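
For reference, sample_weight expects one weight per sample (shape (n_samples,)), so the per-class count array cannot be passed as is. A sketch that expands the counts into per-sample weights, assuming inverse class frequency is the weighting that is wanted:

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

y_true = np.array([0, 0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2])

# return_inverse maps every sample back to the index of its label,
# so counts[inverse] is "how common is this sample's class".
labels, inverse, counts = np.unique(y_true, return_inverse=True, return_counts=True)
sample_weight = 1.0 / counts[inverse]

plain = accuracy_score(y_true, y_pred)
weighted = accuracy_score(y_true, y_pred, sample_weight=sample_weight)
```

With these particular weights the result matches balanced_accuracy_score(y_true, y_pred), which may be the simpler call if class balance is the goal.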


plush jungle Oct 20, 2022, 3:55 PM

#

I'm trying to retrain stylegan3
https://github.com/NVlabs/stylegan3
but I keep getting this error:

  File "C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\torch_utils\misc.py", line 163, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (512) must match the size of tensor b (1024) at non-singleton dimension 1

#

I can't figure out why the tensor shapes would be wrong. I'm running this command

py train.py --outdir=~/training-runs --cfg=stylegan3-t --data=datasets/control_dataset.zip --gpus=1 --batch=32 --gamma=8.2 --mirror=1 --workers=1 --snap=50 --tick=4 --resume=C:\python\Generative-Adversarial-Networks\stylegan3-main\stylegan3-main\pretrained_models\stylegan3-r-ffhqu-256x256.pkl
which runs a model trained on 256x256 images on my dataset (control_dataset.zip) which is also 256x256 images

fresh tiger Oct 20, 2022, 5:05 PM

#

Hi, I have a question regarding ANNs, in particular what the neurons in the hidden layer represent

In the screenshot, ive just drawn up a quick ANN thats used to predict house prices based on the three features: num. of bedrooms, area, and dist. from closest school.

So the hidden layer consist of two layers of neurons. Looking at the first layer in the hidden layer, each neuron will take as input the same features, but the weights may be different, and hence different features may have more impact in some neurons than others. The neurons then apply an activation function etc and produce an output.

My question is, what exactly is this output? What sort of information is a specific neuron calculating and outputting?

Im assuming this is completely flying over my head, but I can not seem to find a clear/direct answer on this, and would appreciate any help

plush jungle Oct 20, 2022, 5:11 PM

#

every hidden layer neuron is computing a score (the activation of that neuron) based on all the ones in the previous layer

#

I like to think of it like each hidden layer neuron is an olympic judge, and the input neurons are a diver

#

each input neuron is a quality that the diver had: gracefulness, amount of splash, difficulty of dive

#

and the hidden layer judges each value those qualities differently

#

the second hidden layer is like another panel of judges

#

only instead of judging the diver, they judge the olympic judges

#

and they too have preferences, so maybe one really hates the russian judge but really likes the swedish judge etc.

#

I don't know if this is making any sense, but TLDR; the first hidden layer finds patterns in the input data. the second hidden layer finds patterns in the patterns

wooden sail Oct 20, 2022, 5:17 PM

#

fresh tiger Hi, I have a question regarding ANNs, in particular what the neurons in the hidd...

drawing networks like that is always kinda deceptive imo. you can think of the circular nodes you drew as being entries of a single vector. the lines joining the nodes are matrices performing linear or affine transformations on those vectors
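
That vector and matrix view can be written out directly. A sketch for the house-price example, with made-up feature values, random untrained weights, and hidden layer sizes (4 and 2) chosen arbitrarily since the screenshot is not available:

```python
import numpy as np

rng = np.random.default_rng(0)

# One house: [bedrooms, area in m^2, distance to school in km] (made-up values).
x = np.array([3.0, 120.0, 0.8])

# Random, untrained weights; the shapes are what matters here.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input (3) -> hidden layer 1 (4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # hidden 1 (4) -> hidden layer 2 (2)
W3, b3 = rng.normal(size=(1, 2)), np.zeros(1)   # hidden 2 (2) -> price (1)


def relu(v):
    return np.maximum(v, 0.0)


h1 = relu(W1 @ x + b1)    # each entry of h1 is one "neuron" of hidden layer 1
h2 = relu(W2 @ h1 + b2)   # each entry of h2 is one "neuron" of hidden layer 2
price = W3 @ h2 + b3      # affine map only, no activation on the output
```

Each row of W1 is the set of weights one first-layer neuron applies to the three inputs.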

serene scaffold Oct 20, 2022, 5:30 PM

#

wooden sail drawing networks like that is always kinda deceptive imo. you can think of the c...

imo, those "connected bipartite subgraphs" visualizations are only intelligible if you already understand how neural networks work. which means that they have no communication power.

wooden sail Oct 20, 2022, 5:30 PM

#

yeah that'd be my take as well

plush jungle Oct 20, 2022, 5:39 PM

#

serene scaffold imo, those "connected bipartite subgraphs" visualizations are only intelligible ...

agreed, but it sure is hard to draw a bunch of vectors in a diagram like that. 3blue1brown displays them as vectors in his videos but they're animated which makes it easier

wooden sail Oct 20, 2022, 5:41 PM

#

you don't need to (and actually can't) draw them geometrically. you could just use thin rectangles and fat rectangles (and cubes/prisms/etc when dealing with multidimensional stuff and/or tensors)

serene scaffold Oct 20, 2022, 5:41 PM

#

plush jungle agreed, but it sure is hard to draw a bunch of vectors in a diagram like that. ...

I actually thought of 3b1b right after I said that. you're right that they're more communicative when they're animated.

plush jungle Oct 20, 2022, 5:42 PM

#

#

if you look, he never actually draws the whole net as vectors

#

this is literally just one neuron

plush jungle Oct 20, 2022, 5:42 PM

#

wooden sail you don't need to (and actually can't) draw them geometrically. you could just u...

do you have an example of someone drawing it like that? it sounds cool

wooden sail Oct 20, 2022, 5:43 PM

#

well it's basically what you drew there just now, just removing the annotations of the elements of each object

#

but lemme fish something up

#

like what you see here for the convolutional parts

#

there's no reason you can't do the same for a dense network

fresh tiger Oct 20, 2022, 5:53 PM

#

plush jungle I don't know if this is making any sense, but TLDR; the first hidden layer finds...

Alright this is starting to make more sense, especially with the analogy

#

There's just one thing im still a bit unclear on

#

what would these patterns consist of?

wooden sail Oct 20, 2022, 5:55 PM

#

#

my artistic interpretation. regarding the patterns, that depends entirely on what you're training the network to do, but in general they are not human-interpretable. most deep learning architectures are not interpretable

serene scaffold Oct 20, 2022, 5:57 PM

#

So artistic

fresh tiger Oct 20, 2022, 5:57 PM

#

Oh, so all we know is that it builds upon some sort of pattern?

#

And so via training the model, we set the weights so that at each layer our neurons find the patterns that lead to the best/most accurate output?

wooden sail Oct 20, 2022, 5:58 PM

#

pretty much. that's why many people dislike it

#

it's hard to derive strict guarantees for its performance, but so far it anyway works better than most classical methods

fresh tiger Oct 20, 2022, 6:01 PM

#

Ok ok I see, so just to summarize:

The different set of weights for each neuron will essentially lead to a different pattern being detected by each neuron. The outputs of all neurons when reaching the end in a way "combine"/each neuron contributes to affecting how the overall model will look at the end, and hence we can get models that can fit to any kind of data (ie models with many squiggly lines when graphed)?

#

OH so like if neuron 1 for example had weights that emphasized num of bedrooms and area

#

there could be a pattern in terms of num of bedrooms and area of house having a particular effect on the output right?

wooden sail Oct 20, 2022, 6:03 PM

#

sure

fresh tiger Oct 20, 2022, 6:04 PM

#

Alright, I think its making sense to me now. Thank you all very much for ur help 🙂

wooden sail Oct 20, 2022, 6:07 PM

#

i prefer looking at it from the perspective of parameter estimation. you assume a model and find the model parameters that best explain the data
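
A tiny illustration of that parameter-estimation view, on synthetic data with an assumed straight-line model (nothing here comes from the conversation's dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data generated from a known model: y = 2.0 * x + 1.0 + noise.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.shape)

# Assumed model: y = slope * x + intercept. Least squares picks the
# (slope, intercept) pair that best explains the observed data.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
```

A neural network does the same kind of fitting, only with far more parameters and a model that is not chosen to be interpretable.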

#

the deep learning model is "ayyy lmao idk what the model is, but this thing has so many parameters it can't go wrong"

fresh tiger Oct 20, 2022, 6:09 PM

#

Right I see, so like in this image for example, the neurons which are connected with a purple line may have higher weights/parameters when connecting to the dog output neuron. and the neurons connected with the green lines may have higher weights for the cat neuron and lower weights for the dog neuron

#

So our model learned via estimating the parameters which neurons have more emphasis on determining if we have a dog or a cat.

wooden sail Oct 20, 2022, 6:12 PM

#

well, but what you're calling a "neuron" here are just entries of intermediate (or final) vectors

#

the only reason those matter is because you yourself chose which one represents dog and which one represents cat

#

but yeah that's more or less the idea

#

the caveat being that the stuff going into that layer already has no interpretation

fresh tiger Oct 20, 2022, 6:14 PM

#

wooden sail well, but what you're calling a "neuron" here are just entries of intermediate (...

just to confirm, is this referring to the red circles?

wooden sail Oct 20, 2022, 6:14 PM

#

mhm

fresh tiger Oct 20, 2022, 6:14 PM

#

Sorry, I was referring to the neurons before those 2

wooden sail Oct 20, 2022, 6:14 PM

#

the idea is basically the same

#

since you're applying an affine transformation, it's two vectors related via a matrix

#

you're finding the entries of that matrix, which correspond to the weights, as you call them

fresh tiger Oct 20, 2022, 6:26 PM

#

Ahaa ok yes. Thank you so much for all of your help! I really appreciate it 🙂

strong sedge Oct 20, 2022, 6:29 PM

#

I have been working on my own neural network implementation using numpy
https://github.com/sivansh11/sklearn-nn-extension
try it out! I feel like there is probably a bug some where in the code lmao
I want this to be an extension to sklearn's neural network capabilities, ie work with all the infrastructure that sklearn has built

GitHub

GitHub - sivansh11/sklearn-nn-extension: A separate extension to sk...

A separate extension to sklearn for adding modular neural networks which in theory should be able to work with sklearn's infrastructure. - GitHub - sivansh11/sklearn-nn-extension: A separa...

desert oar Oct 20, 2022, 7:00 PM

#

strong sedge I have been working on my own neural network implementation using numpy https:/...

nice little project. admittedly i don't think i or many other people would use this when something like skorch is available:

https://towardsdatascience.com/skorch-pytorch-models-trained-with-a-scikit-learn-wrapper-62b9a154623e

https://skorch.readthedocs.io/en/latest/?badge=latest

but it seems like a good self study project!

Medium

SKORCH: PyTorch Models Trained with a Scikit-Learn Wrapper

A guide to understand how easy and simple it is to train PyTorch models with SKORCH

strong sedge Oct 20, 2022, 7:03 PM

#

desert oar nice little project. admittedly i don't think i or many other people would use t...

ohh, I had no idea this existed,
fair enough,

I did start this project while trying to understand gradient descent and other optimizers (Adam etc). so not all is wasted :), I learnt something new

#

the only thing I dont understand is how/where to implement l1 and l2 regularisation
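
One common place for it is the weight gradient, just before the optimizer update. A sketch with plain gradient descent, not tied to the linked repository's code (regularised_step is a made-up name); penalties are usually applied to weights only, not biases:

```python
import numpy as np


def regularised_step(W, grad_loss, lr=0.01, l1=0.0, l2=0.0):
    """One plain gradient-descent update with optional L1/L2 penalties.

    Penalty added to the loss: l1 * sum(|W|) + 0.5 * l2 * sum(W**2),
    whose gradient with respect to W is l1 * sign(W) + l2 * W.
    """
    grad = grad_loss + l1 * np.sign(W) + l2 * W
    return W - lr * grad


W = np.array([[1.0, -2.0], [0.5, 0.0]])
zero_grad = np.zeros_like(W)   # pretend the data loss is flat here

W_l2 = regularised_step(W, zero_grad, lr=0.1, l2=1.0)  # shrinks proportionally
W_l1 = regularised_step(W, zero_grad, lr=0.1, l1=1.0)  # shrinks by a constant
```

With an optimizer such as Adam, the same combined gradient would be what gets passed into the optimizer's update.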

lapis sequoia Oct 20, 2022, 9:05 PM

#

Hello, I have looked everywhere for the answer to this. I am using keras / tensorflow and creating a model

    history = model.fit(
  File "C:\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Python310\lib\site-packages\tensorflow\python\eager\execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Input is empty.
         [[{{node decode_image/DecodeImage}}]]
         [[IteratorGetNext]] [Op:__inference_test_function_7901]
2022-10-20 22:04:21.910095: W tensorflow/core/kernels/data/cache_dataset_ops.cc:856] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

#

The error is within the package, tried reinstalling, doesnt wanna work

wary crown Oct 20, 2022, 9:27 PM

#

so I am trying to split a csv into X and y
here is my code:

# Python version
import sys

from sklearn.metrics import make_scorer

print('Python: {}'.format(sys.version))
# scipy
import scipy

print('scipy: {}'.format(scipy.__version__))
# numpy
import numpy

print('numpy: {}'.format(numpy.__version__))
# matplotlib
import matplotlib

print('matplotlib: {}'.format(matplotlib.__version__))
# pandas
import pandas

print('pandas: {}'.format(pandas.__version__))
# scikit-learn
import sklearn

print('sklearn: {}'.format(sklearn.__version__))

# compare algorithms
from pandas import read_csv
from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

# Load dataset
url = "energy.csv"
#url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['YEAR', 'TOTAL', 'PURCHASED', 'NUCLEAR', 'SOLAR', 'WIND', 'NATURAL_GAS', 'COAL', 'OIL']
dataset = read_csv(url, names=names)
print(dataset.shape)

# Split-out validation dataset
array = dataset.values
X = array[:, 0:8]
y = array[:, 8]


print(y)

when I print y in the last line tho
I get this:

[  19.9948    0.        0.        0.        0.        0.        0.
    0.        0.      260.2    1326.9          nan       nan       nan
       nan       nan  723.18   2070.    ]

which I dont believe is supposed to happen (the 'nan' thing)
can anyone who knows this kind of stuff tell me whats wrong because im not really sure
thanks in advance
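
pandas fills empty or missing fields with NaN, so nan in y usually means some rows of the file have no value in the last column. The actual energy.csv is not shown, so this is only a guess, illustrated on made-up data:

```python
import io
import pandas as pd

# Made-up stand-in for energy.csv: the OIL field is empty on the second row.
csv_text = (
    "2000,10,1,2,3,4,5,6,19.9\n"
    "2001,11,1,2,3,4,5,6,\n"
    "2002,12,1,2,3,4,5,6,260.2\n"
)
names = ['YEAR', 'TOTAL', 'PURCHASED', 'NUCLEAR', 'SOLAR', 'WIND', 'NATURAL_GAS', 'COAL', 'OIL']
dataset = pd.read_csv(io.StringIO(csv_text), names=names)

# Count missing values per column to see where the nan entries come from.
missing_per_column = dataset.isna().sum()

# One option: drop incomplete rows before splitting into X and y.
clean = dataset.dropna()
X = clean.iloc[:, 0:8].to_numpy()
y = clean.iloc[:, 8].to_numpy()
```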

lapis sequoia Oct 20, 2022, 10:48 PM

#

lapis sequoia Hello, I have looked everywhere for the answer to this. I am using keras / tenso...

One of my images was corrupt I believe, ignore me

fringe anvil Oct 21, 2022, 12:25 AM

#

lets say i made a nice scatterplot that uses the whole dataframe. and i want to generate smaller scatterplots, but limit the dataframe to a specific entry in one of the column. so column "species" has 6 different birds. how do i generate those similar but slightly different subset of my main scatterplot? idk if i make any sense

#

so here's what im working with and it made my original scatterplot, but 6 times. now i just need 6 different scatterplots with just the data of a specific entry of my column "surface". which has 6 entries.

num_rows, num_cols = 3,2
fig3, ax3 = plt.subplots(num_rows,num_cols,figsize=(10,12))
fig3.set_facecolor("None") # good

plt.style.use("default") # good

for i in range(num_rows):
    for j in range(num_cols):
        ax3[i,j].scatter(x,y,alpha=0.3)
        ax3[i,j].plot(dbl_ratio_year_avg.index, dbl_ratio_year_avg, "-", color="C1")
        ax3[2,0].set_xlabel("Year")
        ax3[2,1].set_xlabel("Year")
        ax3[0,0].set_ylabel("Double faults per match")
        ax3[1,0].set_ylabel("Double faults per match")
        ax3[2,0].set_ylabel("Double faults per match")

desert oar Oct 21, 2022, 12:28 AM

#

fringe anvil so here's what im working with and it made my original scatterplot, but 6 times....

you just need to filter x and y in the loop. these are called "small multiples" plots, fyi.

fringe anvil Oct 21, 2022, 12:29 AM

#

desert oar you just need to filter `x` and `y` in the loop. these are called "small multipl...

thats the name! and good evening to you! thanks for taking my questions again

desert oar Oct 21, 2022, 12:29 AM

#

and maybe do some clever indexing as well, but that's not strictly necessary

#

let's say that you want to split according to a series or array called categ

#

the only tricky bit here is figuring out which element in the axes array corresponds to which category

fringe anvil Oct 21, 2022, 12:33 AM

#

this is what ive found. fifth column of my dataframe

desert oar Oct 21, 2022, 12:33 AM

#

there are a couple different ways to do it actually

#

@fringe anvil
you can use some clever indexing for this:

import numpy as np
import matplotlib.pyplot as plt

# fraction of player 1's points that were double faults
df2["dbl_ratio"] = (df2["player1 double faults"] / df2["player1 total points total"])
# unique() typically returns a NumPy array, which has tolist() rather than to_list()
surfaces = df2['surface'].unique().tolist()

num_rows, num_cols = 3,2
fig3, axs3 = plt.subplots(
    num_rows, num_cols,
    figsize=(10,12),
    sharex=True, sharey=True,
)

for k, surface in enumerate(surfaces):
    # keep only the rows for this surface
    df_surface = df2.loc[df2['surface'] == surface]
    # yearly mean ('AS' = year start; newer pandas versions prefer 'YS')
    dbl_ratio_year_avg = df_surface.resample('AS', on='start date')["dbl_ratio"].mean()
    # convert the flat index k into the (row, column) of the axes grid
    i, j = np.unravel_index(k, (num_rows, num_cols))
    a = axs3[i, j]
    a.scatter(df_surface['start date'], df_surface['dbl_ratio'], alpha=0.3)
    a.plot(dbl_ratio_year_avg.index, dbl_ratio_year_avg, color="C1")

axs3[2,0].set_xlabel("Year")
axs3[2,1].set_xlabel("Year")
axs3[0,0].set_ylabel("Double faults per match")
axs3[1,0].set_ylabel("Double faults per match")
axs3[2,0].set_ylabel("Double faults per match")

fig3.tight_layout()
plt.show()

#

you can also do this a bit more elegantly with pandas groupby, but this is good enough to start with
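
A hedged sketch of what the groupby variant might look like, on a made-up stand-in for df2 with three surfaces instead of six:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for df2 with the columns used in the plot.
rng = np.random.default_rng(0)
df2 = pd.DataFrame({
    "surface": np.repeat(["clay", "grass", "hard"], 20),
    "start date": pd.date_range("2010-01-01", periods=60, freq="30D"),
    "dbl_ratio": rng.random(60) * 0.1,
})

groups = df2.groupby("surface")
fig, axs = plt.subplots(1, groups.ngroups, figsize=(12, 4), sharex=True, sharey=True)

# zip pairs each flattened axes object with one (name, sub-frame) group.
for ax, (surface, df_surface) in zip(axs.ravel(), groups):
    ax.scatter(df_surface["start date"], df_surface["dbl_ratio"], alpha=0.3)
    ax.set_title(surface)
```

groupby does the filtering, so the explicit df2.loc[...] step and the index bookkeeping both disappear.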

#

np.unravel_index is worth understanding. think of a ~~4x3~~ 3x3 array:

a00  a01  a02
a10  a11  a12
a20  a21  a22

now imagine "walking" through this array by going across each row. when you get to the end of the row, jump down to the beginning of the next row, like a typewriter:

-->---->---->-|
a00  a01  a02 |
|--------------
-->---->---->-|
a10  a11  a12 |
|--------------
-->---->---->-|
a20  a21  a22

idk if my hilariously bad illustration helps

#

what's the array index of the 6th step (k = 5 with zero-indexing) along that walk? it's 1, 2.

imagine if you were to flatten out the array, connecting rows end-on-end, to produce a 1-d array. then flatten(a)[5] == a[1, 2]

#

!eval numpy calls this "ravel" (a pun on "unravel", like yarn or thread):

import numpy as np

a = np.arange(9).reshape((3, 3))
assert a.ravel()[5] == a[1, 2]

arctic wedgeBOT Oct 21, 2022, 12:46 AM

#

@desert oar :warning: Your 3.11 eval job has completed with return code 0.

[No output]

desert oar Oct 21, 2022, 12:47 AM

#

and you can convert between these "flat" ("raveled") indexes and the "non-flat" ("unraveled") array indexes with np.unravel_index and np.ravel_multi_index

#

so either of these would work

    i, j = np.unravel_index(k, (num_rows, num_cols))
    a = axs3[i, j]

    a = axs3.ravel()[k]
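
A quick sketch of the round trip between the two kinds of index, using the same 3x2 grid shape as the subplots:

```python
import numpy as np

num_rows, num_cols = 3, 2
shape = (num_rows, num_cols)

# Flat index -> (row, col), walking across each row in turn.
pairs = [
    tuple(int(v) for v in np.unravel_index(k, shape))
    for k in range(num_rows * num_cols)
]

# (row, col) -> flat index, the inverse operation.
flat = [int(np.ravel_multi_index(p, shape)) for p in pairs]
```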

#

@fringe anvil does that make any sense at all?

#

this is actually how numpy arrays are stored internally: as one big flat array. all the multi-dimensional axis stuff is an illusion, produced by looping over the array contents internally
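
More precisely, the multi-dimensional view comes from index arithmetic on that flat buffer; numpy calls the step sizes "strides". A small sketch, with the dtype fixed to int64 so the byte counts are the same on every platform:

```python
import numpy as np

a = np.arange(9, dtype=np.int64).reshape((3, 3))

# strides = bytes to skip to move one step along each axis:
# 3 items * 8 bytes to reach the next row, 8 bytes to reach the next column.
row_stride, col_stride = a.strides

# Element [1, 2] sits (1 * row_stride + 2 * col_stride) bytes into the buffer,
# i.e. at flat position 5.
offset_items = (1 * row_stride + 2 * col_stride) // a.itemsize
```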

fringe anvil Oct 21, 2022, 12:51 AM

#

sorry im back. almost forgot to load my winter tires in the car for tomorrow lol

fringe anvil Oct 21, 2022, 12:52 AM

#

desert oar `np.unravel_index` is worth understanding. think of a ~~4x3~~ 3x3 array: ``` a00...

that looks 3x3 to my untrained eye

desert oar Oct 21, 2022, 12:52 AM

#

fringe anvil that looks 3x3 to my untrained eye

it looks like 3x3 to my trained eye as well 😆

fringe anvil Oct 21, 2022, 12:54 AM

#

desert oar <@206985846740615168> does that make any sense at all?

yeah it does, you iterate over k,v with enumerate. then use .loc to assign the iteration of the surfaces type to itself then assign that to the figure

desert oar Oct 21, 2022, 12:56 AM

#

fringe anvil yeah it does, you iterate over k,v with enumerate. then use .loc to assign the i...

right. but do you understand my explanation about the array indexing business?

fringe anvil Oct 21, 2022, 12:58 AM

#

desert oar right. but do you understand my explanation about the array indexing business?

yeah looks like it's doing the job of a double for loop, for i in stuff for j in stuff.. so iterate through the elements in first row, then go to second row and do the same etc

#

thats what unravel_index does from what i can see

desert oar Oct 21, 2022, 12:58 AM

#

fringe anvil yeah looks like it's doing the job of a double for loop, for i in stuff for j in...

yeah, it's not looping, but it's converting the k "flat" index into i, j array indexes, at each step of the loop

fringe anvil Oct 21, 2022, 1:00 AM

#

desert oar yeah, it's not _looping_, but it's converting the `k` "flat" index into `i, j` a...

ah yeah, thats what i just saw in your few next lines. im trying to read while out of breath, brain isnt following apparently

#

so we are enumerating on surfaces, is that df2["surface"]