queen cradle Apr 2, 2023, 4:12 PM

#

Then you can do the same thing I suggested earlier, but once for each year within the cycle. So for a cycle of length 5, estimate the frequency of appearance using data from 2001, 2006, 2011, etc.; then the frequency of appearance using 2002, 2007, 2012, etc.; then 2003, 2008, 2013, etc.; ending with an estimate of the frequency of appearance based on whether there was an appearance in 2005, 2010, 2015, 2020.

#

For a 5-long cycle, that gives you five estimates. The one you want for 2023 would be the estimate based on 2003, 2008, 2013, and 2018.
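
A minimal sketch of the per-cycle-position estimate described above, assuming the record is a mapping from year to whether there was an appearance. The function name `phase_frequencies` and the sample record are made up for illustration.

```python
from collections import defaultdict

def phase_frequencies(appearances, cycle_length, start_year):
    """Fraction of years with an appearance, grouped by position in the cycle.

    `appearances` maps year -> bool. Position 0 covers `start_year`,
    `start_year + cycle_length`, and so on.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, appeared in appearances.items():
        phase = (year - start_year) % cycle_length
        totals[phase] += 1
        hits[phase] += int(appeared)
    return {phase: hits[phase] / totals[phase] for phase in sorted(totals)}

# Hypothetical record for 2001-2022: appearances only in the last two years
# of each 5-year cycle.
record = {year: (year - 2001) % 5 >= 3 for year in range(2001, 2023)}
freqs = phase_frequencies(record, cycle_length=5, start_year=2001)

# 2023 sits at the same cycle position as 2003, 2008, 2013 and 2018.
phase_2023 = (2023 - 2001) % 5
estimate_2023 = freqs[phase_2023]

# A cycle of length 1 reduces to the overall fraction of appearances.
overall = phase_frequencies(record, cycle_length=1, start_year=2001)[0]
```

With only four or five observations per cycle position, each estimate is noisy, which is why the smoothing and add-one ideas later in the thread are worth a look.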

#

The method I suggested earlier, where you look at the fraction of appearances for all the years, is actually the same as this one, but for a cycle of length 1.

#

This is not the only thing you can do. If the years within a cycle are completely independent of each other, then it's the maximum likelihood estimate, and you'll be hard-pressed to do any better. But in the example we discussed earlier—two years off, epsilon% chance of appearance, two years on—successive years seem to be correlated with each other.

#

You can exploit that as follows: Make fraction of appearances estimates for each year in the cycle. Say that the cycle has length C and these fractions are x1, ..., xC. These fractions are periodic data. Assuming successive values should be correlated, you should get better results if you smooth this data.

#

There are a variety of ways to do this. You could convolve with a kernel. You could fit a Fourier series and discard high-frequency terms. There is stuff about this in the time series literature.
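
Both smoothing ideas can be sketched with NumPy. The kernel `[1, 2, 1]` and the number of retained frequencies are arbitrary choices for illustration, not recommendations, and the sample fractions are made up.

```python
import numpy as np

def circular_smooth(x, kernel):
    """Convolve periodic data with a kernel, wrapping around the ends."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kernel = kernel / kernel.sum()          # weights sum to 1
    half = len(kernel) // 2
    # np.roll(x, s)[i] == x[i - s], so this is a wrap-around weighted average.
    return sum(w * np.roll(x, half - j) for j, w in enumerate(kernel))

def fourier_lowpass(x, keep):
    """Keep the mean and the `keep` lowest nonzero frequencies."""
    coeffs = np.fft.rfft(np.asarray(x, dtype=float))
    coeffs[keep + 1:] = 0                   # discard high-frequency terms
    return np.fft.irfft(coeffs, n=len(x))

# Hypothetical fractions x1..xC for a cycle of length 5.
x = np.array([0.0, 0.0, 0.1, 1.0, 0.9])
smoothed = circular_smooth(x, [1, 2, 1])
lowpassed = fourier_lowpass(x, keep=1)
```

One difference to keep in mind: a kernel with non-negative weights keeps the results inside [0, 1], while a truncated Fourier series can overshoot and may need clipping.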

#

Okay, I have to get going. I hope this helps!

queen cradle Apr 2, 2023, 5:02 PM

#

Oh, I remembered one other thing. You might get slightly better estimates of the fraction of appearances if you add one to the numbers of appearances and non-appearances. This is called Laplace’s Rule of Succession and is a kind of Bayesian technique.
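
As a small sketch, the add-one rule looks like this. The function name is made up for illustration.

```python
def laplace_estimate(appearances, non_appearances):
    """Rule of Succession: add one to both counts before taking the fraction."""
    return (appearances + 1) / (appearances + non_appearances + 2)

# Four observed years at one cycle position, none with an appearance:
never_seen = laplace_estimate(0, 4)    # 1/6 instead of a hard 0
always_seen = laplace_estimate(4, 0)   # 5/6 instead of a hard 1
no_data = laplace_estimate(0, 0)       # 1/2 when nothing has been observed
```

The main effect is that small samples no longer produce estimates of exactly 0 or 1.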

lapis sequoia Apr 2, 2023, 5:09 PM

#

@queen cradle can I ask a question?

simple lantern Apr 2, 2023, 7:38 PM

#

Hi there, if anyone is familiar with PCA, could you check whether this plot looks correct in terms of dimensionality reduction? To me it looks like I can reduce the dimensions to 18.

import numpy as np
import matplotlib.pyplot as plt

explained_variance_ratio = np.array([0.15204826, 0.12169249, 0.07493663, 0.06553111, 0.06038744,
                                     0.05784934, 0.05351325, 0.04958794, 0.04762545, 0.04716847,
                                     0.04626846, 0.04332389, 0.04200419, 0.03658627, 0.03129437,
                                     0.03022434, 0.02267887, 0.01051982, 0.00414997, 0.00241604,
                                     0.0001934 ])

# Calculate the cumulative explained variance
cumulative_explained_variance = np.cumsum(explained_variance_ratio)

# Plot the elbow plot
plt.figure(figsize=(10, 6))
plt.plot(range(1, len(explained_variance_ratio) + 1), explained_variance_ratio, marker='o', label='Individual Explained Variance')
plt.plot(range(1, len(cumulative_explained_variance) + 1), cumulative_explained_variance, marker='s', linestyle='--', label='Cumulative Explained Variance')
plt.xlabel('Number of Principal Components')
plt.ylabel('Explained Variance Ratio')
plt.title('Elbow Plot')
plt.legend()
plt.grid()
plt.show()
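
One way to check the "18" reading without eyeballing the plot is to pick a cumulative-variance threshold and count components. This is a sketch using the ratios posted above; the thresholds 0.95 and 0.99 are common conventions, not something the data dictates.

```python
import numpy as np

explained_variance_ratio = np.array([0.15204826, 0.12169249, 0.07493663, 0.06553111, 0.06038744,
                                     0.05784934, 0.05351325, 0.04958794, 0.04762545, 0.04716847,
                                     0.04626846, 0.04332389, 0.04200419, 0.03658627, 0.03129437,
                                     0.03022434, 0.02267887, 0.01051982, 0.00414997, 0.00241604,
                                     0.0001934])
cumulative = np.cumsum(explained_variance_ratio)

def components_for(threshold):
    """Smallest number of components whose cumulative ratio reaches `threshold`."""
    return int(np.searchsorted(cumulative, threshold) + 1)

n_95 = components_for(0.95)   # 16 components reach 95%
n_99 = components_for(0.99)   # 18 components reach 99%
```

So 18 components keep a little over 99% of the variance, and 16 already keep about 96%. The spectrum is fairly flat, though, so going from 21 to 18 dimensions is only a modest reduction.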

violet gull Apr 2, 2023, 8:18 PM

#

how are pytorch conv2d biases and weights seeded?

queen cradle Apr 2, 2023, 8:48 PM

#

lapis sequoia @queen cradle can I ask a question?

(Sorry, I was busy. Back now for a little bit.) Go ahead.

lapis sequoia Apr 2, 2023, 8:49 PM

#

queen cradle (Sorry, I was busy. Back now for a little bit.) Go ahead.

I found it, but thanks ❤️

queen cradle Apr 2, 2023, 8:49 PM

#

No problem!

cerulean mortar Apr 2, 2023, 9:14 PM

#

Heya maybe I've not googled enough but I'm struggling to find a way to create a grid of subplots with an outer and inner set of axis labels

#

this is what I mean basically

#

I'm wondering if it's possible using pyplot or seaborn?

#

oh also the outer labels would be for categorical variables which i guess is obvious
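
A rough matplotlib sketch of one way to get outer categorical labels around a grid that also has ordinary inner axis labels. The category names and plotted data are placeholders, and the placement values are guesses that may need tuning.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

row_categories = ["group A", "group B"]       # hypothetical outer row labels
col_categories = ["low", "medium", "high"]    # hypothetical outer column labels

fig, axes = plt.subplots(len(row_categories), len(col_categories),
                         sharex=True, sharey=True, figsize=(9, 5))
x = np.linspace(0, 1, 50)
for i, row in enumerate(row_categories):
    for j, col in enumerate(col_categories):
        ax = axes[i, j]
        ax.plot(x, x ** (i + j + 1))
        if i == 0:
            ax.set_title(col)                 # outer label, one per column
        if j == len(col_categories) - 1:
            # outer label, one per row, written along the right-hand edge
            ax.annotate(row, xy=(1.02, 0.5), xycoords="axes fraction",
                        rotation=270, ha="left", va="center")
        if i == len(row_categories) - 1:
            ax.set_xlabel("time")             # inner x label, bottom row only
        if j == 0:
            ax.set_ylabel("value")            # inner y label, left column only
fig.tight_layout()
```

On the seaborn side, `FacetGrid` has a `margin_titles` option that produces a similar layout from a long-form DataFrame, which may be less manual.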

candid garnet Apr 2, 2023, 9:18 PM

#

if i have a tensor of shape (x,y,z,16) and i want to reshape it to (x,y,z,4,4) is there a tensorflow command to do that? any solution i've seen means x,y,z have to be known values but it's subject to change in my case, I just want to square the last axis
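
A sketch of one way to do this in TensorFlow: build the target shape from the tensor's dynamic shape, so the leading dimensions never have to be written out. The helper name `square_last_axis` is made up.

```python
import tensorflow as tf

def square_last_axis(t, side=4):
    """Reshape (..., side*side) to (..., side, side) with unknown leading dims."""
    leading = tf.shape(t)[:-1]                        # dynamic shape of x, y, z
    new_shape = tf.concat([leading, [side, side]], axis=0)
    return tf.reshape(t, new_shape)

t = tf.zeros((2, 3, 5, 16))
out = square_last_axis(t)
```

`tf.shape` is evaluated at run time, so this should also work inside a `tf.function` where the static shape has `None` entries.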

stone pine Apr 2, 2023, 9:46 PM

#

Anyone using synthetic data?
https://www.youtube.com/watch?v=ep0PhwsFx0A

YouTube

YData

Synthetic data generation with Streamlit app

Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical components of real data containing no identifiable information, ensuring an individual’s privacy.

We have released a new ydata-synthetic version that includes a Streamlit app to ease your process of synthetic data generation...

boreal gale Apr 2, 2023, 9:51 PM

#

cerulean mortar this is what I mean basically

have you come across the term inset before?

might want to check this out as a demo if not: https://matplotlib.org/stable/gallery/axes_grid1/inset_locator_demo.html

it's a bit manual but probably can do what you wanted

stone glacier Apr 3, 2023, 7:40 AM

#

Hey all, it me again

#

does anyone have any good alternatives (open-sourced) for Tableau?

untold flicker Apr 3, 2023, 8:03 AM

#

Does the N-BEATS time series model have its own documentation? I'm trying to find it.

stone glacier Apr 3, 2023, 8:17 AM

#

https://pytorch-forecasting.readthedocs.io/en/stable/api/pytorch_forecasting.models.nbeats.NBeats.html ??

#

maybe also this: https://unit8co.github.io/darts/generated_api/darts.models.forecasting.nbeats.html ?

untold flicker Apr 3, 2023, 10:34 AM

#

thank you

light steppe Apr 3, 2023, 1:28 PM

#

hello all, what is the best machine learning model to use and implement an Alzheimer prediction model? the model must be able to handle custom inputs and have a prediction percentage as its output. we're building an mvp at the moment if that context matters. im torn between svm and random forests

also, best IDEs for machine learning and training models? i'm not convinced by VSC

(ping for reply)

dawn mortar Apr 3, 2023, 1:45 PM

#

can someone tell me how I can access each of the inner elements in my tensorflow prediction [[5.0252132e-37 8.4258248e-16 2.4297525e-25 1.6483234e-02 1.2865312e-22
9.8351675e-01 4.9724836e-31 2.6893213e-29 6.2175032e-10 5.4309886e-12]]

#

these aren't separated by commas, this is really confusing

wooden sail Apr 3, 2023, 1:46 PM

#

you can index as usual

#

try prediction[0][index_you_want_to_see]

#

or just comma separated too: prediction[0, index_you_want_to_see]
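
A short sketch of the indexing, using the values from the question. The missing commas are only how NumPy prints arrays; the prediction is a normal 2-D array with one row.

```python
import numpy as np

prediction = np.array([[5.0252132e-37, 8.4258248e-16, 2.4297525e-25, 1.6483234e-02, 1.2865312e-22,
                        9.8351675e-01, 4.9724836e-31, 2.6893213e-29, 6.2175032e-10, 5.4309886e-12]])

first_row = prediction[0]                   # the 10 class scores for the first sample
p5_chained = prediction[0][5]               # index the row, then the element
p5_comma = prediction[0, 5]                 # same element, comma-separated form
best_class = int(np.argmax(prediction[0]))  # position of the largest score
```

If a plain Python list is easier to work with, `prediction[0].tolist()` gives one.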

dawn mortar Apr 3, 2023, 1:52 PM

#

omg thank you edd

thorny drum Apr 3, 2023, 1:58 PM

#

When in databricks, I wanted to open a csv file directly

#

Does anyone know how to get this to work?

s3_boto = boto3.client('s3')
obj1 = s3_boto.get_object(Bucket=config_bucket, Key=path)
data = obj1['Body'].read().decode('latin1')

#

After this, I can write the file, but I can't open the contents I pull in the correct format
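
One possible reading of the problem is that `data` ends up as one long string rather than a table. A sketch of parsing that string with pandas follows; the sample text stands in for the decoded S3 body, and the helper name is made up.

```python
import io
import pandas as pd

def csv_text_to_frame(text):
    """Parse CSV text that is already decoded to str into a DataFrame."""
    return pd.read_csv(io.StringIO(text))

# Stand-in for: data = obj1['Body'].read().decode('latin1')
data = "id,name\n1,caf\u00e9\n2,na\u00efve\n"
df = csv_text_to_frame(data)
```

If the file uses a different delimiter or has no header row, `read_csv` needs the matching `sep` or `header` arguments.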

cerulean mortar Apr 3, 2023, 3:53 PM

#

boreal gale have you come across the term `inset` before? might want to check this out as a...

thank you for the suggestion! i've decided to include the relevant information in the titles of each of my subplots for now but will keep this in mind for the future

hoary prism Apr 3, 2023, 4:28 PM

#

hi, im supposed to see anaconda 3/4 in my jupyter folder but it's not there. how can i fix this?

serene scaffold Apr 3, 2023, 4:30 PM

#

hoary prism hi, im supposed to see anaconda 3/4 in my jupyter folder but it's not there. how c...

what do you mean by "jupyter folder"?

hoary prism Apr 3, 2023, 4:30 PM

#

sorry, i meant in the jupyter notebook folders

serene scaffold Apr 3, 2023, 4:31 PM

#

Alright. I don't use or like anaconda, so I'll let someone else take it from here.

hoary prism Apr 3, 2023, 4:31 PM

#

👌

wooden sail Apr 3, 2023, 4:35 PM

#

hoary prism hi, im supposed to see anaconda 3/4 in my jupyter folder but it's not there. how c...

do you want to use anaconda as your interpreter?

hoary prism Apr 3, 2023, 4:36 PM

#

wooden sail do you want to use anaconda as your interpreter?

sorry i dont know what that means, im pretty new. i want to install tensorflow-gpu

raw compass Apr 3, 2023, 4:36 PM

#

what server do you guys recommend so I can train my models there?

wooden sail Apr 3, 2023, 4:37 PM

#

hoary prism sorry i dont know what that means, im pretty new. i want to install tensorflow-gpu

ok. you have installed anaconda already, yes?

wooden sail Apr 3, 2023, 4:37 PM

#

raw compass what server do you guys recommend so I can train my models there?

google colab is good and free

hoary prism Apr 3, 2023, 4:38 PM

#

wooden sail ok. you have installed anaconda already, yes?

yes

wooden sail Apr 3, 2023, 4:39 PM

#

hoary prism yes

ok. that's really all you need. anaconda installs its stuff in some directory in your computer, named anaconda3, but you never have to interact with that folder yourself

raw compass Apr 3, 2023, 4:39 PM

#

wooden sail google colab is good and free

can you recommend good documentation that I can follow?

hoary prism Apr 3, 2023, 4:40 PM

#

wooden sail ok. that's really all you need. anaconda installs its stuff in some directory in...

how?

#

I see that i have tensorflow but not tensorflow-gpu

wooden sail Apr 3, 2023, 4:42 PM

#

right, tensorflow gpu has to be installed separately. lemme see if i can find a good explanation on how to install it, because that also requires some nvidia drivers

#

huh turns out it's kind of annoying in windows haha

#

https://neptune.ai/blog/installing-tensorflow-2-gpu-guide here is one guide on how to do it

neptune.ai

Installing TensorFlow 2 GPU [Step-by-Step Guide] - neptune.ai

Tensorflow is one of the most-used deep-learning frameworks. It’s arguably the most popular machine learning platform on the web, with a broad range of users from those just starting out, to people looking for an edge in their careers and businesses. Not all users know that you can install the TensorFlow GPU if your hardware…

wooden sail Apr 3, 2023, 4:44 PM

#

raw compass can you recommend good documentation that I can follow?

their own is good. it should be fairly intuitive though: you basically run a jupyter notebook on their hardware. you can read about it here https://colab.research.google.com/

Google Colaboratory

rotund cove Apr 3, 2023, 6:11 PM

#

hey! there's my kakuro solver
i thought the code was "well" written but i'm not happy with the output, so i'm looking for help with what i can improve or change

arctic wedgeBOT Apr 3, 2023, 6:12 PM

#

Hey @rotund cove!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

rotund cove Apr 3, 2023, 6:13 PM

#

https://paste.pythondiscord.com/ufajameteb

#

the output looks like this:

```
Fitness value of the best solution = 97
Number of generations passed is 5000
Best solution found is:
['*', [0, 17], [0, 6], '*', '*']
[[11, 0], 1, 1, [0, 24], '*']
[[17, 0], 1, 1, 1, [0, 3]]
['*', [11, 0], 9, 9, 9]
['*', '*', [11, 0], 1, 9]

Correct solution is:
[['*', [0, 17], [0, 6], '*', '*'], [[11, 0], 8, 3, [0, 24], '*'], [[17, 0], 9, 1, 7, [0, 3]], ['*', [11, 0], 2, 8, 1], ['*', '*', [11, 0], 9, 2]]
```

violet gull Apr 3, 2023, 6:44 PM

#

how are pytorch conv2d biases and weights seeded?

wooden sail Apr 3, 2023, 6:46 PM

#

wdym by seeded

violet gull Apr 3, 2023, 6:46 PM

#

like initialized

#

when they are first birfed

wooden sail Apr 3, 2023, 6:46 PM

#

that's probably in the docs, isn't it?

violet gull Apr 3, 2023, 6:46 PM

#

reading is hard

#

i cant find

wooden sail Apr 3, 2023, 6:47 PM

#

https://github.com/pytorch/pytorch/blob/08891b0a4e08e2c642deac2042a02238a4d34c67/torch/nn/modules/conv.py#L40-L47

arctic wedgeBOT Apr 3, 2023, 6:47 PM

#

torch/nn/modules/conv.py lines 40 to 47

```py
def reset_parameters(self):
    n = self.in_channels
    for k in self.kernel_size:
        n *= k
    stdv = 1. / math.sqrt(n)
    self.weight.data.uniform_(-stdv, stdv)
    if self.bias is not None:
        self.bias.data.uniform_(-stdv, stdv)
```

wooden sail Apr 3, 2023, 6:47 PM

#

first google result 😛

violet gull Apr 3, 2023, 6:47 PM

#

cool bot feature

#

so it's a uniform distribution with a standard deviation of 1/sqrt(n)

#

and n is input channels

#

pog

wooden sail Apr 3, 2023, 6:48 PM

#

no

#

you ignored the for loop

violet gull Apr 3, 2023, 6:49 PM

#

ok

#

is kernel size the total size or the size of one dimension?

#

2x2 = 4 or just 2

wooden sail Apr 3, 2023, 6:50 PM

#

you'll have to test this in your code and see, i don't know. but since it appears to return an iterable, it seems to be all sizes

violet gull Apr 3, 2023, 6:50 PM

#

so 4

wooden sail Apr 3, 2023, 6:50 PM

#

test and see to make sure

violet gull Apr 3, 2023, 6:50 PM

#

but we are using 3d kernels, aren't we

#

so 8

wooden sail Apr 3, 2023, 6:51 PM

#

so test and see to make sure

violet gull Apr 3, 2023, 6:51 PM

#

ok ty Edd ❤️
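
The arithmetic in the quoted snippet can be reproduced without PyTorch. This sketch assumes `kernel_size` is a tuple with one entry per spatial dimension, which is what the `for` loop suggests; the function name `reset_bound` is made up.

```python
import math

def reset_bound(in_channels, kernel_size):
    """Half-width of the uniform interval used by the quoted reset_parameters."""
    n = in_channels
    for k in kernel_size:   # one factor per spatial dimension
        n *= k
    return 1.0 / math.sqrt(n)

bound_2d = reset_bound(3, (2, 2))       # n = 3 * 2 * 2 = 12
bound_3d = reset_bound(3, (2, 2, 2))    # n = 3 * 2 * 2 * 2 = 24
uniform_std = bound_2d / math.sqrt(3)   # std of U(-a, a) is a / sqrt(3)
```

Two things follow from this. For a 2x2 kernel the loop contributes 4, and `in_channels` supplies the third factor, so n is the number of inputs feeding one output value. Also, `stdv` is used as the half-width of the interval, so the actual standard deviation of the weights is smaller by a factor of sqrt(3).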

raw compass Apr 3, 2023, 7:24 PM

#

wooden sail their own is good. it should be fairly intuitive though: you basically run a jup...

so it can be only jupyter?

wooden sail Apr 3, 2023, 7:26 PM

#

raw compass so it can be only jupyter?

on colab, yes

#

you can write all your code in regular .py files and then just import them and call them from the notebook

raw compass Apr 3, 2023, 8:05 PM

#

wooden sail on colab, yes

okay thank you

drifting spear Apr 3, 2023, 8:07 PM

#

is there anyone that can help me with my code? i am trying to make it so that it loads audio tracks into a KNN and then uses input from a microphone to test it

wooden sail Apr 3, 2023, 8:08 PM

#

what's troubling you about it?

drifting spear Apr 3, 2023, 8:09 PM

#

i dont know if it's actually training correctly, and i tried to get it to stop with 'p' but it just prints the same thing over and over. doesnt help that im new and have been trying to do it based on what little ive managed to find out

wooden sail Apr 3, 2023, 8:10 PM

#

a good way to check that is to print the loss at every epoch

drifting spear Apr 3, 2023, 8:11 PM

#

i know im going to sound dumb but how do i do that? also what does that mean exactly?

wooden sail Apr 3, 2023, 8:18 PM

#

oh sorry i misread that as cnn haha i thought you were using a neural network

#

so you're doing vector quantization of audio

#

the best way of testing this would be to play back the encoded audio. looking at the numbers won't tell you all that much. mostly because of psychoacoustics (how sound is perceived vs what it actually is)

#

you can plot the original audio vs the encoded audio, but in general the best test is to play it back and listen to it

drifting spear Apr 3, 2023, 8:24 PM

#

sorry i dont know if i explained it properly but what i need to happen is for the mic to pick up sound and compare it to the training data and play a sound depending on the class prediction

hasty mountain Apr 3, 2023, 8:25 PM

#

Doing that without a neural network is a bit strange pithink

#

Oh, I get it now. Do you want the sound to be generated? Or will it be a sample from your training data?

#

If it should be generated, then I really don't know how to do that with KNN...

wooden sail Apr 3, 2023, 8:26 PM

#

huh that's very different from what i thought you were doing. in that case i also find it weird to do it without a neural network. what are the vectors you're using for knn?

drifting spear Apr 3, 2023, 8:30 PM

#

hasty mountain Oh, I get it now. Do you want the sound to be generated? Or will it be a sample ...

it's just a test to see if this system would work, i.e. whether or not it can correctly plot the sound of the mic

#

and as im new to this i have to make it simple or i'd be completely lost

hasty mountain Apr 3, 2023, 8:32 PM

#

Uh... Dealing with sound, per se, can't be simple

#

Unless there's situations where you can disregard mel-spectrograms, fourier-transform, etc. pithink

wooden sail Apr 3, 2023, 8:38 PM

#

yes, if the sound is supposed to represent audio

#

that's the psychoacoustics part

sleek harbor Apr 3, 2023, 8:46 PM

#

Got a question about Ordinary Least Squares & Non-Negative Least Squares.
What if you have a bunch of parameters, and some of them can be negative, some can't.. in scikit learn u can do positive=True to make all coefficients non-negative, but how do you make it so that only some select coefficients will be forced non-negative (for the always positive parameters), and the rest can be whatever?

wooden sail Apr 3, 2023, 8:54 PM

#

hasty mountain Unless there's situations where you can disregard mel-spectrograms, fourier-tran...

the reason you use a mel transform is because the ear perceives the spectrum in a warped scale, not linearly. the frequency axis is compressed logarithmically, and a specific amount of attenuation is attributed to each frequency. similarly, the ear is more sensitive to some frequencies than to others, so there is a reference mask that predicts how well you'll be able to hear distortions at different frequencies. these are all things that only matter specifically because of how humans hear sound, there's nothing special about it otherwise
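
To make the "compressed logarithmically" part concrete, here is one commonly used Hz-to-mel formula. There are several variants, and this particular one is an assumption, not necessarily what any given library uses.

```python
import math

def hz_to_mel(f_hz):
    """One common mel-scale formula: near-linear at low f, logarithmic at high f."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

low_step = hz_to_mel(1000.0) - hz_to_mel(0.0)       # first 1000 Hz
high_step = hz_to_mel(2000.0) - hz_to_mel(1000.0)   # next 1000 Hz spans fewer mels
```

Equal steps in Hz shrink on the mel axis as frequency rises, which is the warping being described. A mel spectrogram applies a bank of filters spaced evenly on this axis.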

hasty mountain Apr 3, 2023, 8:54 PM

#

wooden sail the reason you use a mel transform is because the ear perceives the spectrum in ...

But then...shouldn't it be used in this case, too?

wooden sail Apr 3, 2023, 8:55 PM

#

there are other weird effects like intensity and temporal masking. loud, short duration sounds make it impossible to hear other sounds after, but also before them

#

oh yeah, that was a typo lol

#

idk why i typed no. that's my default response to everything, i guess

hasty mountain Apr 3, 2023, 8:55 PM

#

lol

#

But then... I guess that, unless you're making an algorithm for biology purposes or detection of wave sounds that can't be heard by humans, mel-spectrograms should be used by default, right?

wooden sail Apr 3, 2023, 8:57 PM

#

sleek harbor Got a question about Ordinary Least Squares & Non-Negative Least Squares. What i...

you could try constrained optimization instead
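
One concrete form of this is bounded least squares, where each coefficient gets its own lower and upper bound. A sketch with SciPy's `lsq_linear` on synthetic data; which columns are constrained is an arbitrary choice for the example.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.5, 0.5])
y = X @ true_coef + 0.01 * rng.normal(size=200)

# Columns 0 and 2 must be >= 0; column 1 is left unconstrained.
lower = np.array([0.0, -np.inf, 0.0])
upper = np.full(3, np.inf)
result = lsq_linear(X, y, bounds=(lower, upper))
coef = result.x
```

This fits no intercept. Appending a column of ones to `X` with bounds `(-inf, inf)` would add one.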

wooden sail Apr 3, 2023, 8:57 PM

#

hasty mountain But then... I guess that, unless you're making an algorithm for biology purposes...

well, most sound is inaudible to us

#

i work a lot with ultrasound

#

very low frequency sound is also just perceived as vibration. this includes most of seismology

hasty mountain Apr 3, 2023, 9:09 PM

#

wooden sail i work a lot with ultrasound

Ultrasound, eh?
Do you use spectrograms directly? Energy spectrograms?

wooden sail Apr 3, 2023, 9:09 PM

#

every now and then. i usually use parametric methods instead

hasty mountain Apr 3, 2023, 9:11 PM

#

https://www.ultrasoundresearchgroup.com/research/parametric-sound/

#

pithink

#

Interesting... Can I use that for ultrasound images in medical exams?

wooden sail Apr 3, 2023, 9:11 PM

#

yeah

hasty mountain Apr 3, 2023, 9:12 PM

#

And, is that how the ultrasound images are generated, by chance?

wooden sail Apr 3, 2023, 9:12 PM

#

hasty mountain https://www.ultrasoundresearchgroup.com/research/parametric-sound/

that's not quite what i meant by parametric methods btw. i meant more generally, parameter estimation

hasty mountain Apr 3, 2023, 9:13 PM

#

I see... Guess I'll have to take a look at that.
I have a somewhat ambitious project involving ultrasound sensors, but at micro/nano scale

signal robin Apr 3, 2023, 9:31 PM

#

I am creating a dataset for speech processing, I have a json file that I want to modify by adding 199 to every value in the index

#

basically i need to change the 1 in the first position as well as in audio path to be 200 and 2 into 201 and so on
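
The JSON file itself did not survive in this log, so the following is only a sketch under an assumed layout: a list of objects with an `index` field and an `audio_path` that ends in the same number. Both field names and the sample paths are hypothetical.

```python
import json
import re

OFFSET = 199

def shift_entry(entry, offset=OFFSET):
    """Return a copy with the index and the number in the file name shifted."""
    shifted = dict(entry)
    shifted["index"] = entry["index"] + offset
    # Only the digits directly before the file extension are changed.
    shifted["audio_path"] = re.sub(
        r"(\d+)(\.\w+)$",
        lambda m: f"{int(m.group(1)) + offset}{m.group(2)}",
        entry["audio_path"],
    )
    return shifted

raw = '[{"index": 1, "audio_path": "wavs/1.wav"}, {"index": 2, "audio_path": "wavs/2.wav"}]'
entries = [shift_entry(e) for e in json.loads(raw)]
output = json.dumps(entries, indent=2)
```

For a real file, `json.load` and `json.dump` on open file handles would replace the string round trip. If the audio files on disk are not renamed to match, the new paths will not resolve.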

sleek harbor Apr 3, 2023, 9:47 PM

#

wooden sail you could try constrained optimization instead

No idea what that is.. thanks for the reply, I'll look into it. U've gotta be some kind of genius. I see u here all the time, u've got an answer to all questions, and most importantly - u got a pixel cat pfp 💯

hasty mountain Apr 3, 2023, 9:51 PM

#

||Edd = Math oracle brainmon ||

brazen sphinx Apr 4, 2023, 3:09 AM

#

Hi everyone. I need help here with a plotly scatterplot. The units for CO2 flux on the y axis are wrong. It says µ, from 0-~160µ. But in my table it is in g / m^2 / s (around the 0.000x mark). I'm not sure what is happening here. Any suggestions?
Also, I want the y axis to read as CO_2 flux g^-2 s^-1. How do I write this?

cold osprey Apr 4, 2023, 4:17 AM

#

brazen sphinx Hi everyone. I need help here with a plotly scatterplot. The units for CO2 flux ...

The µ is micro. There aren't any units there rn

brazen sphinx Apr 4, 2023, 5:01 AM

#

cold osprey The µ is micro. There aren't any units there rn

Why aren't the data displaying the correct numbers on the y axis then?

cold osprey Apr 4, 2023, 5:06 AM

#

brazen sphinx Why aren't the data displaying the correct numbers on the y axis then?

0.0005 = 500µ . Does this make sense for ur data values?

brazen sphinx Apr 4, 2023, 5:07 AM

#

cold osprey 0.0005 = 500µ . Does this make sense for ur data values?

Hmmm. Yep. I think it does thanks. Do you know how to fix this?

cold osprey Apr 4, 2023, 5:07 AM

#

brazen sphinx Hmmm. Yep. I think it does thanks. Do you know how to fix this?

not sure. proly gotta read the docs

#

same for adding in the units u want it to display

#

alternatively u can put the units in the title

#

CO2 (g / m^2 /s )

brazen sphinx Apr 4, 2023, 5:08 AM

#

cold osprey CO2 (g / m^2 /s )

Awesome. Thanks. How do I write that line in python?

cold osprey Apr 4, 2023, 5:09 AM

#

no idea haha google fam

#

https://plotly.com/python/reference/layout/yaxis/

#

docs r ur friend

#

and stackoverflow

brazen sphinx Apr 4, 2023, 5:54 AM

#

thanks for your help homie

dense fractal Apr 4, 2023, 7:26 AM

#

```
TypeError                                 Traceback (most recent call last)
<ipython-input-39-19626fc93a6d> in <cell line: 2>()
      1 from mlxtend.plotting import plot_decision_regions
----> 2 plot_decision_regions(x_train,y_train.values, clf=clf , legend = 2)

1 frames
/usr/local/lib/python3.9/dist-packages/matplotlib/axes/_base.py in axis(self, arg, emit, **kwargs)
   2125             self.set_ylim(ymin, ymax, emit=emit, auto=yauto)
   2126         if kwargs:
-> 2127             raise _api.kwarg_error("axis", kwargs)
   2128         return (*self.get_xlim(), *self.get_ylim())
   2129 

TypeError: axis() got an unexpected keyword argument 'y_min'
```

#

this is my error

#

```
from mlxtend.plotting import plot_decision_regions

plot_decision_regions(x_train, y_train.values, clf=clf, legend=2)
```

#

this is my code

#

I don't know what the error is here

cold osprey Apr 4, 2023, 7:39 AM

#

dense fractal ``` TypeError Traceback (most recent call last) ...

just guessing here but maybe a package version mismatch?

dense fractal Apr 4, 2023, 7:40 AM

#

cold osprey just guessing here but maybe a package version mismatch?

how do I resolve that? I'm new here

cold osprey Apr 4, 2023, 8:39 AM

#

dense fractal how do I resolve that? I'm new here

https://github.com/rasbt/mlxtend/issues/735

GitHub

TypeError: axis() got an unexpected keyword argument 'y_min' · Iss...

Hi I am having the following issue since conda package update TypeError: axis() got an unexpected keyword argument 'y_min' Priviously I could able to run and see my results.. Even I tried w...

#

its just a google away bruh

#

literally first search result

dense fractal Apr 4, 2023, 8:40 AM

#

@cold osprey it is my first project so I don't know anything ☺️

solar yew Apr 4, 2023, 9:26 AM

#

[NLP]

Hi guys, thought it wouldn't be appropriate to spam my entire question here. But if anyone has any insights into applying BERT models on large bodies of text I would really appreciate it

https://discord.com/channels/267624335836053506/1092740527495061585

#

Perhaps I could also try finetuning it in relation to Central Bank communication? or apply a weighting to the sentiment with the specific target words in the text section

Really open to suggestions!

ruby venture Apr 4, 2023, 12:45 PM

#

Not sure if this is the right chat for this, but I'm a total newbie to python and coding. For my medical physics internship project I have been given a task involving the processing of large 2D array data files in .xcc/XML format and making a bunch of graphs to help display results and tolerances. I have some code from a colleague to help me along but I'm really out of my depth. If someone in here is a whiz in this area I would super appreciate some help as I don't have an awful lot of time.

serene scaffold Apr 4, 2023, 12:49 PM

#

ruby venture Not sure if this is the right chat for this, but I'm a total newbie to python an...

> I don't have an awful lot of time.
if you essentially need someone to do it for you, you probably won't get help here. that said, you have to give all the information needed for someone to start helping, or no one will try to help.

ruby venture Apr 4, 2023, 12:52 PM

#

Ahh I see. I don't need someone to do it for me, I think what I need is someone to review what tasks I have to perform, and tell me where I can find accurate resources for what I need to achieve, or even better, teach me how to complete the tasks. I'm happy to give more information, though it is quite a lot to type out, hence why I thought maybe it would be easier for someone to screen share with me and review the task and code I have for themselves.

candid garnet Apr 4, 2023, 2:16 PM

#

I'm currently working with a pretty huge tensor multiplication that I had working in numpy for a smaller version. There are complex numbers, and item assignment is a pretty handy way to do some operations (like array[..., 0, 0] = 123), so I started to look into GPU acceleration.

I'm using an M1 Mac, which pytorch supports, but apparently it doesn't like working with complex numbers yet (which is a core part of what i'm doing) "TypeError: Trying to convert ComplexDouble to the MPS backend but it does not have support for that dtype."

I then looked at tensorflow, but it doesn't support item assignment (I guess I could make it work, but it would be a fair bit more complicated).

I also don't think JAX supports mac m1 gpu yet but i'm not sure.

Pytorch would be the most ideal if anyone knows a way of getting the gpu to play nicely with complex numbers.

Any ideas?

wooden sail Apr 4, 2023, 2:23 PM

#

in fairness, pytorch also doesn't exactly support assignment. if your operations depend on the values of a tensor and you modify any of its entries and then use the tensor again, you'll get an error at some point down the line because the state of the tensor changed unexpectedly. to get to your actual question, i was under the impression pytorch did support differentiation with complex parameters using Wirtinger calculus https://pytorch.org/docs/stable/notes/autograd.html#complex-autograd-doc but if not, you can always split your function into real and imaginary parts (has to be done carefully)
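
The real/imaginary split for a matrix product can be checked with NumPy before moving it to a backend without complex dtypes. This sketch only verifies the algebra; whether the four real products then run well on the MPS backend is an assumption that is not tested here.

```python
import numpy as np

def complex_matmul_split(a_re, a_im, b_re, b_im):
    """(A_re + i A_im) @ (B_re + i B_im) using only real-valued products."""
    real = a_re @ b_re - a_im @ b_im
    imag = a_re @ b_im + a_im @ b_re
    return real, imag

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3)) + 1j * rng.normal(size=(4, 3))
b = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
real, imag = complex_matmul_split(a.real, a.imag, b.real, b.imag)
```

The "carefully" part is mostly about signs: conjugates, transposes and any nonlinear functions each need their own real-valued rule.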

blazing ember Apr 4, 2023, 2:57 PM

#

Is there a better library to run ANOVAs other than scipy?

raw compass Apr 4, 2023, 3:03 PM

#

how does model training work exactly? I mean, what's not clear is how I can improve and optimize the model by running a python file, if that is gonna execute after run-time? So basically, if it is not changing anything "CONSTANT", then what is the point of doing that?

untold cliff Apr 4, 2023, 3:06 PM

#

I just found out that the correlation coefficient measures relationship only and doesn't imply causation at all, so does this mean that including variables with good correlation with the target doesn't mean that they would actually be good for the model?

serene scaffold Apr 4, 2023, 3:11 PM

#

raw compass how does a model training work exactly? I mean what's not clear is that how can ...

what kind of model?

raw compass Apr 4, 2023, 3:12 PM

#

serene scaffold what kind of model?

just a basic model.

serene scaffold Apr 4, 2023, 3:12 PM

#

raw compass just a basic model.

there's no "basic model" about which one can make general statements that apply to all model training.

raw compass Apr 4, 2023, 3:13 PM

#

serene scaffold there's no "basic model" about which one can make general statements that apply ...

I'm more curious about the "how", generally.

serene scaffold Apr 4, 2023, 3:15 PM

#

raw compass I'm more curious about the "how", generally.

have you seen 3blue1brown's series about neural networks?

#

(there are lots of models that aren't neural networks, so statements about "basic neural networks" do not apply to all models.)

raw compass Apr 4, 2023, 3:16 PM

#

serene scaffold (there are lots of models that aren't neural networks, so statements about "basi...

but how does the training work, exactly.

serene scaffold Apr 4, 2023, 3:16 PM

#

raw compass but how does the training work, exactly.

try watching that video series.

violet gull Apr 4, 2023, 3:47 PM

#

arctic wedge `torch/nn/modules/conv.py` lines 40 to 47 ```py def reset_parameters(self): ...

My page looks completely different

#

That code does not exist on the main branch, and the branch it claims to be on also does not exist

#

I can’t find it anywhere

subtle mural Apr 4, 2023, 4:20 PM

#

Hi peeps,

Not sure if this is the right area to ask, but is anyone able to give me their opinion on which of the following setups is better? I'm planning on getting a new desktop.
I'm mainly looking to train and run both vision and LLM (e.g. llama 7b or 13b int8) models locally

CPU
Intel Core i9 13900KF | 24 Cores 32 Threads
COOLING
AFTERSHOCK Glacier Mirror 360mm Watercooling
MOTHERBOARD
Gigabyte Z790 Aorus Elite AX D5
GPU
Gigabyte RTX 4090 Gaming OC 24GB
RAM
32GB ADATA Lancer RGB DDR5 6000MHz (16x2)
SSD
2TB Lexar NM710 Gen4 SSD
PSU
1000W FSP Hydro GT Pro 80+ Gold

VS

CPU
AMD Ryzen 9 7950X Processor
COOLING
AFTERSHOCK Glacier Mirror 360mm Watercooling
MOTHERBOARD
Gigabyte B650 Gaming X AX
GPU
Gigabyte RTX 4090 Gaming OC 24GB
RAM
32GB ADATA Lancer RGB DDR5 6000MHz (16x2)
SSD
2TB Lexar NM710 GEN4 SSD
PSU
1200W FSP Hydro Ptm Pro 1200W 80+ Platinum

serene scaffold Apr 4, 2023, 4:25 PM

#

subtle mural Hi peeps, Not sure if this is the right area to ask but is anyone able to give ...

the main concern for model training is going to be GPU size. It looks like both involve the 4090 at 24GB, so you'll want to confirm that the LLMs you want to work with take up less space than that.

#

`AMD Ryzen 9 7950X Processor` what's the core and thread count for this one?

subtle mural Apr 4, 2023, 4:52 PM

#

serene scaffold `AMD Ryzen 9 7950X Processor` what's the core and thread count for this one?

Heya, thanks for taking the time to reply 🙂

The processor is 16 core, 32 threads

serene scaffold Apr 4, 2023, 4:53 PM

#

subtle mural Heya, thanks for taking the time to reply 🙂 The processor is 16 core, 32 threa...

looks like the first build is generally better, though since you have the same GPU and RAM in both builds, I'm not sure how much it will impact your training.

subtle mural Apr 4, 2023, 4:54 PM

#

serene scaffold looks like the first build is generally better, though since you have the same G...

Thanks, your opinion helps!
I'm guessing that the intel build is better because of the higher core count?

#

im thinking that the only way my training will be affected will be in CPU bound tasks e.g data pipeline processing, augmentation etc🤔

serene scaffold Apr 4, 2023, 4:56 PM

#

subtle mural Thanks, your opinion helps! I'm guessing that the intel build is better because...

right. though if you're training on the GPU, then you're not using all those extra cores.

> im thinking that the only way my training will be affected will be in CPU bound tasks e.g data pipeline processing, augmentation etc
how much of that can you actually do in parallel?

#

and even if you could, how much time would it actually save you?

mild dirge Apr 4, 2023, 4:57 PM

#

Most of the time 2 workers is enough for loading in the data and augmenting

subtle mural Apr 4, 2023, 5:06 PM

#

serene scaffold right. though if you're training on the GPU, then you're not using all those ext...

those are some good points, i gotta think about this more😅

subtle mural Apr 4, 2023, 5:07 PM

#

mild dirge Most of the time 2 workers is enough for loading in the data and augmenting

i didn't know that o.0 i usually just put as many workers as i have cores 🤣

violet gull Apr 4, 2023, 5:20 PM

#

Edd

serene scaffold Apr 4, 2023, 5:21 PM

#

violet gull Edd

why speaketh thou the blessed name?

violet gull Apr 4, 2023, 5:25 PM

#

serene scaffold why speaketh thou the blessed name?

Edd my savior

#

He smartest person on the internet

#

And Edd knows PyTorch

serene scaffold Apr 4, 2023, 5:27 PM

#

you might as well just ask your question as if Edd were here. Maybe he'll answer it later. Maybe someone else can answer it.

sleek harbor Apr 4, 2023, 5:29 PM

#

Do y'all ever use residual plots such as sns.residplot()? Or are they more of a "cool, that's possible, never gonna use it tho" kind of thing? Can I get some use case examples?

violet gull Apr 4, 2023, 5:39 PM

#

Why does PyTorch claim kernels are 3x3 but Edd says it does a convolution with a 3D kernel?

#

When I look at how weights are initialized it uses self.kernel_size

#

When I did Conv2d(999, 998, 997), self.kernel_size has 2 entries because the kernel is 2D

#

Also the GitHub page Edd referenced is nonexistent

slender terrace Apr 4, 2023, 5:48 PM

#

I have this idea for a project. TLDR of it is to train how a player plays a game by looking at replay files and the corresponding level and generate a replay file from a new level. But there's so many issues:

How do I link a replay and level?
Do I have to split up the replay into its keyframes or use the whole replay at once?
How does the encoder encode an entire file?

And so many more. I have no idea where to begin. If someone can help me find some resources or tell me what to do to get started, I'd appreciate it. I've been looking and I haven't found anything remotely related to how to do this :(

subtle mural Apr 4, 2023, 5:48 PM

#

violet gull Why does PyTorch claim kernels are 3x3 but Edd says it does a convolution with a ...

im guessing by 3 x 3 kernels, you're referring to pytorch's nn.Conv2d?

violet gull Apr 4, 2023, 5:48 PM

#

subtle mural im guessing by 3 x 3 kernels, you're referring to pytorch's nn.Conv2d?

Yes

violet gull Apr 4, 2023, 5:49 PM

#

slender terrace I have this idea for a project. TLDR of it is to train how a player plays a game...

I assume you would train it the same way you train cars to drive from video frames

subtle mural Apr 4, 2023, 5:49 PM

#

what's the context behind the convolution with a 3D kernel?

violet gull Apr 4, 2023, 5:50 PM

#

#data-science-and-ml message

#

This conversation

violet gull Apr 4, 2023, 5:52 PM

#

slender terrace I have this idea for a project. TLDR of it is to train how a player plays a game...

Or depending on the game you can try reinforcement learning

#

Using openCV to read the screen

slender terrace Apr 4, 2023, 5:54 PM

#

The whole point of the project is to play with a player's playstyle, and replays are really the only way to get that information

violet gull Apr 4, 2023, 5:54 PM

#

What game

slender terrace Apr 4, 2023, 5:56 PM

#

trackmania

wooden sail Apr 4, 2023, 5:57 PM

#

violet gull Why does PyTorch claim kernels are 3x3 but Edd says it does a convolution with a ...

seems at some point they replaced the xavier init with kaiming init https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py it's also a uniform distribution, just with a different range

slender terrace Apr 4, 2023, 5:57 PM

#

the replays have the player inputs and positions of the car because they have some weird validation thing that doesnt work

subtle mural Apr 4, 2023, 6:01 PM

#

violet gull Why does PyTorch claim kernels are 3x3 but Edd says it does a convolution with a ...

Conv2d doesn't have a default 3 x 3 kernel? Do you have a link to pytorch saying that kernels are 3 x 3? From the convo you linked, im guessing that edd was referring to the 2D kernel of size (2 x 2) with the `groups` setting left at its default of 1 as per https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html ?

violet gull Apr 4, 2023, 6:06 PM

#

subtle mural Conv2d doesn't have a default 3 x 3 kernel? Do you have a link to pytorch saying...

By 3x3 I meant n x m, meaning 2-dimensional, not 3

verbal venture Apr 4, 2023, 6:06 PM

#

is there any advice for combining datasets for a multilabel classification CNN?

fiery jungle Apr 4, 2023, 6:41 PM

#

hi ,

def plot_the_model(trained_weight, trained_bias, feature, label):
  """Plot the trained model against the training feature and label."""

  # Label the axes.
  plt.xlabel("feature")
  plt.ylabel("label")

  # Plot the feature values vs. label values.
  plt.scatter(feature, label)

  # Create a red line representing the model. The red line starts
  # at coordinates (x0, y0) and ends at coordinates (x1, y1).
  x0 = 0
  y0 = trained_bias
  x1 = feature[-1]  # <-- WHAT IS THIS?
  y1 = trained_bias + (trained_weight * x1)
  plt.plot([x0, x1], [y0, y1], c='r')

  # Render the scatter plot and the red line.
  plt.show()

could someone tell me what is feature[-1] ?

#

im trying to learn TF and im fairly new to machine learning

#

how could the feature index be negative?

wooden sail Apr 4, 2023, 6:42 PM

#

at a glance, it looks like you're fitting a straight line

fiery jungle Apr 4, 2023, 6:42 PM

#

wooden sail at a glance, it looks like you're fitting a straight line

correct

wooden sail Apr 4, 2023, 6:42 PM

#

this would draw a line segment joining the first and last points

agile cobalt Apr 4, 2023, 6:42 PM

#

broadly speaking, list[-1] / array[-1] in python = list[len(list) - 1] / array[len(array) - 1]

fiery jungle Apr 4, 2023, 6:43 PM

#

agile cobalt broadly speaking, list[-1] / array[-1] in python = list[len(list) - 1] / array[l...

ohh now i see

#

thanks a lot

wooden sail Apr 4, 2023, 6:43 PM

#

feature is your vector of input values, the x values. feature[-1] is "the last element of the feature vector", hence why it "joins the first and last points" on the line
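
A quick sketch of the indexing rule, using a made-up `feature` list (the values are hypothetical, not from the tutorial):

```python
# Hypothetical feature values; any non-empty list or array behaves the same way.
feature = [1.0, 2.0, 3.5, 7.0]

last_by_negative_index = feature[-1]
last_by_length = feature[len(feature) - 1]

# Both expressions pick out the same (last) element.
print(last_by_negative_index == last_by_length)

# Negative indices count from the end: -2 is the second-to-last element.
print(feature[-2])
```

So in the plotting code, `x1 = feature[-1]` just takes the x value of the last sample as the right-hand end of the red line.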

fiery jungle Apr 4, 2023, 6:44 PM

#

wooden sail feature is your vector of input values, the x values. feature[-1] is "the last e...

that makes sense, thanks a lot ❤️

pastel verge Apr 4, 2023, 6:56 PM

#

guys, i'm building an application on streamlit. I have a dataframe, and I made an aggrid table based on it. Then I made a groupby on the filtered data from this aggrid table. Now, I want to make a new aggrid table based on this groupby, and this groupby updates every time I filter the original aggrid table. The groupby update I can do, it's done, but the aggrid table of this groupby is not working

#

as I update the groupby data, this new aggrid table should update as well

cerulean lantern Apr 4, 2023, 7:50 PM

#

Has anybody used YOLOv8 to measure objects?

sleek harbor Apr 4, 2023, 10:04 PM

#

sklearn's PolynomialFeatures vs numpy's polyfit.. is there any point in using numpy's version when sklearn does the same thing? Like, if I'm more comfortable with sklearn, can I just stick with it, or are there drawbacks?

agile cobalt Apr 4, 2023, 10:15 PM

#

use sklearn's

#

numpy's method might be preferred if you want to do complicated statistics and/or want full control over what your code and model are doing, but if you are more focused on the end result / the predictions than the details of the method itself, just use sklearn

verbal venture Apr 4, 2023, 11:06 PM

#

this CNN tutorial I'm following made a model and initialized it with model(4). He passed an int as the parameter - is that normal?

agile cobalt Apr 4, 2023, 11:15 PM

#

depends - that "model" is which function or class exactly?

hoary prism Apr 4, 2023, 11:36 PM

#

img = cv2.imread('1902539.jpg')
plt.imshow(img)
plt.show()
whats the problem with my code?

#

w/o the py

serene scaffold Apr 5, 2023, 12:08 AM

#

hoary prism ```py img = cv2.imread('1902539.jpg') plt.imshow(img) plt.show()``` whats th...

what happened to tell you that it's wrong?

#

unless it's an obvious syntax error, it's usually not possible to just look at code and know immediately what's wrong with it.

#

or if it is, it's unnecessarily difficult, as compared to figuring out what's wrong with it when you have some information about its intention.

fiery jungle Apr 5, 2023, 12:32 AM

#

hi,
I have 2 questions on this

 model.add(tf.keras.layers.Dense(units=1, 
                                  input_shape=(1,)))

what do units=1 and input_shape=(1,) mean? and why do we use input_shape=(1,) and not input_shape=(1)?

#

im completely new to ML so forgive me if it was a noob question 😄

#

nvm chat gpt explained it for me LOL

#

what would the humans do after the ai takes over ? 😄

#

chatGpt: We use input_shape=(1,) instead of input_shape=(1) because the input shape must always be a tuple, even if it only contains one value.
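
The tuple point can be checked in plain Python without TensorFlow. This is only a sketch of the language rule, not of what Keras does with the value:

```python
# Parentheses alone do not make a tuple; the trailing comma does.
not_a_tuple = (1)
one_element_tuple = (1,)

print(type(not_a_tuple).__name__)        # int
print(type(one_element_tuple).__name__)  # tuple
print(len(one_element_tuple))            # 1
```

In other words, `(1)` is just the integer 1 wrapped in parentheses, while `(1,)` is a one-element tuple describing a shape with a single dimension of size 1.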

untold flicker Apr 5, 2023, 1:14 AM

#

I have normalised my data by subtracting the mean and dividing by the standard deviation, but a lot of my features still have a pretty big range. They don't all fall between -1 and 1. They also don't all have a mean centred at 0. Will this be a problem for my neural network?

fiery jungle Apr 5, 2023, 1:38 AM

#

untold flicker I have normalised my data by subtracting by the mean and dividing by the standar...

Regarding your concern about the range of some features not falling between -1 and 1, and the mean not being centered at 0, it's important to remember that normalization is just one step in preprocessing your data. While it can help, it's not always necessary for all neural networks.

If you find that your neural network is not performing as well as you'd like, you could try different techniques for normalization, such as min-max scaling or using feature scaling methods like logarithmic scaling. You could also try other preprocessing techniques like feature engineering or dimensionality reduction to see if they improve the performance of your model.

Overall, it's important to test and experiment with different preprocessing techniques to find the best approach for your specific use case.
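
A small sketch of what standardizing does and does not guarantee, using a made-up feature with one outlier (the numbers are hypothetical):

```python
import statistics

# Hypothetical feature with one large outlier.
values = [1.0, 2.0, 2.0, 3.0, 50.0]

mean = statistics.fmean(values)
std = statistics.pstdev(values)  # population standard deviation

standardized = [(v - mean) / std for v in values]

# Mean is (numerically) 0 and standard deviation is 1 after standardizing...
print(round(statistics.fmean(standardized), 12))
print(round(statistics.pstdev(standardized), 12))

# ...but nothing forces the values into [-1, 1]; the outlier lands outside.
print(max(standardized))
```

Values outside [-1, 1] are therefore expected. A mean that is clearly not 0 after standardizing would more likely point to the mean and standard deviation having been computed on a different subset of the data than the one being inspected, though that is only a guess without seeing the code.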

chrome parrot Apr 5, 2023, 2:00 AM

#

a bit of recursion humor written and voiced by AI
https://youtu.be/pIhR1CFGYVQ

YouTube

North Spark Defense Laboratory

North Spark's Recursive Innovation Excursion, 4k HD

sharp jewel Apr 5, 2023, 3:12 AM

#

can someone help me write code to visualize data?

clever summit Apr 5, 2023, 3:46 AM

#

I need help
OpenCV error

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_11504\1904821197.py in <module>
     88     #print(outputs[2].shape)
     89 
---> 90     count=findObjects(outputs,img)
     91     inccount+=count
     92     countReset=int(time.time()-startTime)

~\AppData\Local\Temp\ipykernel_11504\1904821197.py in findObjects(outputs, img)
     65         #print(classNames)
     66         #print(int(confs[i]*100))
---> 67         cv2.line(0,(int(img.shape[0]/2)+3,int(img.shape[1]),int(img.shape[0]/2)-3),(0,0,100),1)
     68         cv2.putText(img,f'{classNames[classIds[i]].upper()} {int(confs[i]*100)}%',(x,y-10),cv2.FONT_HERSHEY_SIMPLEX,0.6,(255,255,0),2)
     69         return count

error: OpenCV(4.7.0) :-1: error: (-5:Bad argument) in function 'line'
> Overload resolution failed:
>  - Can't parse 'pt1'. Expected sequence length 2, got 3
>  - Can't parse 'pt1'. Expected sequence length 2, got 3
Here's the code: https://paste.pythondiscord.com/lutebijuru
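
The message "Expected sequence length 2, got 3" is about how the parentheses group the arguments in the failing call. A sketch that reproduces the grouping with a stand-in `shape` tuple (hypothetical values, no OpenCV needed):

```python
# Stand-in for img.shape (rows, cols, channels); the real image is not needed
# to see how the arguments are grouped.
shape = (480, 640, 3)

# Arguments exactly as grouped in the failing call:
#   cv2.line(0, (a + 3, b, a - 3), (0, 0, 100), 1)
args_as_written = (
    0,
    (int(shape[0] / 2) + 3, int(shape[1]), int(shape[0] / 2) - 3),
    (0, 0, 100),
    1,
)
pt1_as_written = args_as_written[1]
print(len(pt1_as_written))  # 3, which matches "Expected sequence length 2, got 3"

# One possible regrouping (an assumption about the intent): two 2-element points.
pt1 = (0, int(shape[0] / 2) + 3)
pt2 = (int(shape[1]), int(shape[0] / 2) - 3)
print(len(pt1), len(pt2))
```

With points shaped like that, the call would look like `cv2.line(img, pt1, pt2, (0, 0, 100), 1)`, with the image itself as the first argument instead of `0`. Which coordinates were actually intended is a guess.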

arctic wedgeBOT Apr 5, 2023, 6:00 AM

#

Hey @arctic moss!

It looks like you tried to attach file type(s) that we do not allow (.heic). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

arctic moss Apr 5, 2023, 6:00 AM

#

does this mean my model is overfitting? My validation is orange and training is grey

Screenshot_2023-04-04_at_11.00.16_PM.png

winter siren Apr 5, 2023, 6:15 AM

#

what algorithm should I use if I want to extract key points from some data, kinda like in these screenshots?

#

Screen_Shot_2023-04-05_at_4.11.55_pm.png

Screen_Shot_2023-04-05_at_4.15.01_pm.png

wooden sail Apr 5, 2023, 6:17 AM

#

by key points you mean critical points? where the slope is 0?

winter siren Apr 5, 2023, 6:17 AM

#

yeah

wooden sail Apr 5, 2023, 6:19 AM

#

i'm not sure if there's a built-in function for that. one thing you can do is use a finite difference scheme and mark points where the gradient changes sign

#

then interpolate between those points

#

https://numpy.org/doc/stable/reference/generated/numpy.gradient.html this can do the finite difference part for you. alternatively you could use cubic splines from the get-go and differentiate those

#

ah wait i remembered the name

#

https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html
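
For reference, the finite-difference idea (mark points where the slope changes sign) can be sketched in plain Python. The function name `turning_points` and the sample curve are made up for illustration:

```python
def turning_points(values):
    """Return (peaks, troughs) as lists of indices where the slope changes sign."""
    peaks, troughs = [], []
    for i in range(1, len(values) - 1):
        before = values[i] - values[i - 1]   # backward difference
        after = values[i + 1] - values[i]    # forward difference
        if before > 0 and after < 0:
            peaks.append(i)
        elif before < 0 and after > 0:
            troughs.append(i)
    return peaks, troughs


curve = [0.0, 1.0, 3.0, 2.0, 0.5, 1.5, 4.0, 3.0]
peaks, troughs = turning_points(curve)
print(peaks)
print(troughs)
```

This sketch ignores flat plateaus and does no noise filtering, so on noisy mocap curves it will likely flag many tiny wiggles. `find_peaks` has options for filtering by height, distance, and prominence, and running it on the negated signal is a common way to get troughs.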

winter siren Apr 5, 2023, 6:22 AM

#

yeah, im a 3d animator working on mocap data. in the software im importing the data to (shown in the above screenshots), the interpolation is done automatically so i don't need to worry about that. more just picking out the points where there's the most difference

#

thanks a bunch

wooden sail Apr 5, 2023, 6:23 AM

#

hmm maybe i misunderstood the question, given what you just said

#

you want to compare the two curves to each other?

winter siren Apr 5, 2023, 6:25 AM

#

wooden sail you want to compare the two curves to each other?

nope Just wanted to find the peaks and trough points.

wooden sail Apr 5, 2023, 6:26 AM

#

ok. then yeah, peaks 😛

quartz thicket Apr 5, 2023, 8:37 AM

#

Is this the best channel to ask a question about pyomo?

#

if not, I put my question here: #1093093447302402069 message

hasty mountain Apr 5, 2023, 9:51 AM

#

Sigh
After diving deep into studying GANs...time to dive deep into Transformers... Looks like their simplicity ends when the implementation ends and training begins.

#

But I shall do this while my prototype of Text GAN is training brainmon

gleaming gyro Apr 5, 2023, 11:56 AM

#

I was trying to use Latex within matplotlib but I am getting this error
`! LaTeX Error: File 'type1ec.sty' not found.

Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: sty)

Enter file name:
! Emergency stop.
<read *>

l.8 \usepackage
[utf8]{inputenc}`

#

i searched and all the results are for linux. i am on windows 10

wooden sail Apr 5, 2023, 12:01 PM

#

what were you trying to do?

thorny canyon Apr 5, 2023, 12:55 PM

#

Guys, does somebody know if I can get predictions when I use RecBole and BERT4Rec?

untold cliff Apr 5, 2023, 12:58 PM

#

Is it wrong to impute missing values using simple strategies like replacing with the mean, even if it gives better results than imputing with kNN, for example? Even though it preserves the mean of the non-missing values, it biases the variance and covariance towards 0, so we might get optimistic results.

queen cradle Apr 5, 2023, 1:19 PM

#

untold cliff Is it wrong to impute missing values using simple strategies like replacing with...

Suppose I give you data with two populations: There are 1000 samples with values clustered very tightly around (0, 0) and 99 samples with values clustered very tightly around (1, 1). There is one sample of the form (1.01, ?), where the ? is missing data. What do you think the ? should be?

#

Another situation. Suppose I give you a time series. It has a slow, periodic up-and-down cycle. On one of the up cycles, it goes up, and up, and keeps going up, and suddenly there's missing data. After a few missing data points, we observe some very high data values, and the values go down, and the same cycle as before is restored. What values do you think should be imputed?

untold cliff Apr 5, 2023, 1:27 PM

#

In my opinion not the mean but the most likely value which depends on its closest neighbors i guess

queen cradle Apr 5, 2023, 1:31 PM

#

Statistically, you're using nearest neighbors as a density approximation: Basically, you're saying that there's a density function (which has some shape you understand from the samples where you have data), and you're looking at the conditional density with respect to the data that you have (for the sample where you're trying to impute missing data). There are lots of ways of constructing density estimates, but nearest neighbors is a good one.

#

Replacing by the mean ignores all the information you actually have, so it's usually a poor choice.

#

And in the time series example I gave, you have information even though there are times where you have no data. This is because successive values of the time series are correlated. (The situation I actually had in mind is that the data comes from a sensor that reports "error" when the reading is too high.)

untold cliff Apr 5, 2023, 1:37 PM

#

Actually now that i think about it it might be good to replace with the mean in case of a variable with low variability maybe ?

#

So its ok to replace with the mean in the 1st example ?

queen cradle Apr 5, 2023, 1:45 PM

#

The first example is a bunch of data clustered around (0, 0) and a bunch of data clustered around (1, 1). The mean (and also the median) in the second coordinate will be around 0 because the subpopulation around (0, 0) has more members. If you have something whose first coordinate is near 1, and you guess that the second coordinate is near 0, then you're predicting the existence of something near (1, 0), where you have literally zero data and so zero reason to believe that your imputed data should be. You should impute a second coordinate near 1.
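
A rough sketch of the two-cluster example, comparing mean imputation with a nearest-neighbour guess. The clusters are collapsed to exact points for simplicity, and `k = 5` is an arbitrary choice:

```python
import statistics

# Hypothetical data mirroring the example: a big cluster near (0, 0)
# and a smaller one near (1, 1). Exact values are made up.
complete = [(0.0, 0.0)] * 1000 + [(1.0, 1.0)] * 99
incomplete_x = 1.01  # the sample (1.01, ?)

# Mean imputation: ignores the first coordinate entirely.
mean_guess = statistics.fmean(y for _, y in complete)

# Nearest-neighbour imputation: average y over the k samples closest in x.
k = 5
neighbours = sorted(complete, key=lambda p: abs(p[0] - incomplete_x))[:k]
knn_guess = statistics.fmean(y for _, y in neighbours)

print(round(mean_guess, 3))
print(knn_guess)
```

The mean guess lands near 0.09, which would place the imputed sample close to (1, 0), where there is no data at all. The neighbour-based guess stays inside the cluster around (1, 1).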

fringe ermine Apr 5, 2023, 1:46 PM

#

Looking for professional ai developers I need help with a simple program made using ai if that sounds like you please dm

mild dirge Apr 5, 2023, 1:53 PM

#

Ask your question here @fringe ermine

untold cliff Apr 5, 2023, 1:54 PM

#

queen cradle The first example is a bunch of data clustered around (0, 0) and a bunch of data...

I see, that makes sense. Are there any cases where imputing with the mean (or median, most frequent ...) is more appropriate? (I believe these are called ad hoc methods, right?)

fiery jungle Apr 5, 2023, 2:17 PM

#

hi ,

I have a question about this pic

Im not familiar with this type of left-hand side. i dont know how it's processed or how it's ok to do (x1, y1), (x2, y2) = someFunction(). how is this not raising any errors?

#

chatgpt answered 😄


The syntax x, y = someFunction() is called "multiple assignment" in Python. It's a shorthand way of assigning multiple variables at once, based on the values returned by a function or another iterable object.

When you use multiple assignment, Python automatically unpacks the values returned by the function or iterable object and assigns them to the variables on the left-hand side. For example, if someFunction() returns a tuple of two values (1, 2), then x, y = someFunction() will assign the value 1 to x and the value 2 to y.

As long as the number of variables on the left-hand side matches the number of values returned by the function or iterable object, the assignment will work without raising any errors. Python does not check the types of the values during unpacking.

Overall, multiple assignment is a convenient feature of Python that can make code shorter and easier to read. But it's important to be careful when using it, especially with complex functions or iterable objects, to avoid unexpected behavior or errors.

mild dirge Apr 5, 2023, 2:22 PM

#

This is how to unpack a tuple of tuples into 4 separate variables

#

!e

var = ((1, 2), (3, 4))
(a, b), (c, d) = var
print(a, b, c, d)

arctic wedgeBOT Apr 5, 2023, 2:22 PM

#

@mild dirge :white_check_mark: Your 3.11 eval job has completed with return code 0.

1 2 3 4

mild dirge Apr 5, 2023, 2:22 PM

#

Doing it without brackets gives this

#

!e

var = ((1, 2), (3, 4))
a, b, c, d = var
print(a, b, c, d)

arctic wedgeBOT Apr 5, 2023, 2:23 PM

#

@mild dirge :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "/home/main.py", line 2, in <module>
003 |     a, b, c, d = var
004 |     ^^^^^^^^^^
005 | ValueError: not enough values to unpack (expected 4, got 2)

mild dirge Apr 5, 2023, 2:23 PM

#

Ah, thought you were confused about the gpt answer, nvm

fiery jungle Apr 5, 2023, 2:26 PM

#

mild dirge Ah, thought you were confused about the gpt answer, nvm

thanks for explaining it further but omg im fascinated by how chatgpt unpacked my trashy question, and what is also amazing is how it explained it in a very detailed answer 😄

mild dirge Apr 5, 2023, 2:26 PM

#

Yeah, sometimes..

fiery jungle Apr 5, 2023, 2:27 PM

#

mild dirge Yeah, sometimes..

sometimes ? u had bad exp with it ?

mild dirge Apr 5, 2023, 2:28 PM

#

It's just not reliable. It will tell you some incorrect stuff with full confidence. It's great for inspiration, or getting information that you can then verify yourself. But not just to ask for facts 😛

#

But for these kinda questions it's pretty nice indeed

fiery jungle Apr 5, 2023, 2:30 PM

#

mild dirge It's just not reliable. It will tell you some incorrect stuff with full confiden...

oh yeah i agree, but the way i see it, it has reached the moon at lightning speed. yes it hasnt reached every inch of the universe yet, but it looks like that wont be a problem in the near future if it keeps growing that way 😄

#

elon musk is right

quartz thicket Apr 5, 2023, 2:42 PM

#

I'm trying to use scipy's curve_fit. But I'm having trouble getting it to accept my mapping function because I think it should be a fractional logarithm, and

#

import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from math import log

def bestGuess(x, a, b, c):
    comp = a * log(x, 1/b) + c
    return comp

plt.scatter(Msol, M_v)
curve_fit(bestGuess, Msol, M_v)

plt.legend()
plt.show()

#

I don't think numpy.log lets me define the base, so I resorted to math.log, but I dunno if that's the issue.

#

right now the error I'm getting is TypeError: only size-1 arrays can be converted to Python scalars for the line in my bestGuess function

raw compass Apr 5, 2023, 2:55 PM

#

what is the best way to use huge data-sets, I want to work with the "openwebtext" data-set but I don't want to download 12GB directly.

mild dirge Apr 5, 2023, 3:19 PM

#

I think it would definitely be simplest to just download the dataset to your local machine

#

12 GB is not a lot, most machines have 16+ GB RAM and a few terabytes of storage

untold cliff Apr 5, 2023, 3:25 PM

#

quartz thicket I'm trying to use scipy's curve_fit. But I'm having trouble getting it to accept...

I guess x is an array, and i dont think math.log takes arrays as arguments, only scalars, so this might be the issue

quartz thicket Apr 5, 2023, 3:28 PM

#

untold cliff I guess x is an array, and i dont think math.log takes arrays as arguments but s...

Yeah, I think you're right. I'm having trouble understanding though how to give it what its looking for. I changed the expression, and think I'm getting closer. But (as far as I can tell) it looks like I'm giving it lists when it wants lists.

#

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

Msol = [59, 48, ... 0.075]
M_v = [-5.8, -5.5, ... 20.5, 20.9]

def bestGuess(x, a, b, c):
    comp = a * np.exp(-b * x) + c
    return comp

popt, pcov = curve_fit(bestGuess, Msol, M_v)
plt.scatter(Msol, M_v)
plt.plot(Msol, bestGuess(Msol, *popt))
plt.xlabel('Msol')
plt.ylabel('Absolute Magnitude')
plt.show()

#

Now I'm getting: TypeError: can't multiply sequence by non-int of type 'numpy.float64' on the comp= line.

#

I understand that I'm passing Msol to it, which is a list, and not a float.

violet gull Apr 5, 2023, 3:31 PM

#

I successfully recreated PyTorch conv layer and the output is identical, that is one less point of failure in my custom nn

quartz thicket Apr 5, 2023, 3:32 PM

#

But I don't know how else to write the line to call it. It seems like I'm doing it correctly, in comparison to the tutorials I'm reading

untold cliff Apr 5, 2023, 3:34 PM

#

quartz thicket I understand that I'm passing Msol to it, which is a list, and not a float.

Yeah you have to convert it to a numpy array in order to be able to perform multiplication like that. When you multiply a normal python list by an integer you're essentially repeating the list, like [1, 2] * 2 gives [1,2,1,2]

#

Though its a float in your case not an int and float isnt a valid operand type for this operation
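
A short sketch of that behaviour with a plain Python list (the values are arbitrary):

```python
values = [1, 2]

# Multiplying a list by an int repeats the list.
print(values * 2)  # [1, 2, 1, 2]

# Multiplying a list by a float is not defined and raises TypeError.
try:
    values * 2.0
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

# Element-wise scaling in plain Python needs an explicit loop or comprehension.
scaled = [2.0 * v for v in values]
print(scaled)  # [2.0, 4.0]
```

A numpy array, by contrast, treats `-b * x` as element-wise arithmetic, which is why converting `x` to an array first makes the expression work.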

sleek harbor Apr 5, 2023, 3:35 PM

#

(scikit learn)
When preprocessing, you're supposed to use fit_transform on training data and just transform on testing data, right? To prevent data leakage. But does that rule apply to PolynomialFeatures? I mean.. I don't really see what could possibly be leaked here.. So can I apply PolynomialFeatures before splitting a dataset into training&test data, or would that lead to some problems?

quartz thicket Apr 5, 2023, 3:36 PM

#

untold cliff Though its a float in your case not an int and float isnt a valid operand type f...

Ugh. I was afraid of that. It seems impossible to find a way to use a logarithm in a fitting function. 😭

untold cliff Apr 5, 2023, 3:36 PM

#

quartz thicket Ugh. I was afraid of that. It seems impossible to find a way to use a logarithm ...

No just add a line to convert x to a numpy array and the operation -b * x would work

#

x = np.array(x)

mild dirge Apr 5, 2023, 3:37 PM

#

sleek harbor *(scikit learn)* When preprocessing, you're supposed to use `fit_transform` on t...

Yeah, this does not leak the data. Nothing about the generated features is affected by the feature values of your test data

untold cliff Apr 5, 2023, 3:37 PM

#

Or np.asarray seems more appropriate here i guess

mild dirge Apr 5, 2023, 3:37 PM

#

As long as you did not decide to use this on the basis of the features of the test data, it's fine
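
To illustrate why no leakage is expected, here is a hand-rolled stand-in for a polynomial expansion (not scikit-learn's implementation, and `poly_features` is a made-up helper). It assumes each row is expanded on its own, with no statistics learned from other rows:

```python
def poly_features(row, degree=2):
    """Hand-rolled stand-in: powers of a single-feature row, [1, x, x**2, ...]."""
    x = row[0]
    return [x ** d for d in range(degree + 1)]


data = [[1.0], [2.0], [3.0], [4.0]]
train, test = data[:3], data[3:]

# Expand everything first, then split...
expanded_then_split = [poly_features(r) for r in data][3:]
# ...or split first, then expand only the test rows.
split_then_expanded = [poly_features(r) for r in test]

print(expanded_then_split == split_then_expanded)  # True
```

Contrast this with something like a scaler, where the fitted mean and standard deviation would change depending on whether the test rows were included.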

quartz thicket Apr 5, 2023, 3:39 PM

#

Ah, that's it! That's gotten me closer!

#

Ugh, my new fit function isn't very good. I need to go back to using math.log somehow

violet gull Apr 5, 2023, 3:40 PM

#

How do I backpropagate through a convolutional layer?

#

PyTorch doesn’t calculate gradients directly so I have nothing to test against

quartz thicket Apr 5, 2023, 3:43 PM

#

numpy's log functions don't allow you to specify the base as far as I can tell? like math.log does. But if I've converted x to a numpy array, it doesn't seem like I can use math.log on that data

untold cliff Apr 5, 2023, 3:43 PM

#

quartz thicket Ugh, my new fit function isn't very good. I need to go back to using math.log som...

You can use a list comprehension like [math.log(i) for i in x] and convert the result to a numpy array if you need but there's probably a better way to do that

mild dirge Apr 5, 2023, 3:44 PM

#

https://numpy.org/devdocs/reference/generated/numpy.emath.logn.html

#

This maybe?

untold cliff Apr 5, 2023, 3:45 PM

#

Oh yeah logn of emath works

quartz thicket Apr 5, 2023, 3:45 PM

#

Oh wow. I missed that one.

#

oh I see. I didn't look in emath

untold cliff Apr 5, 2023, 3:46 PM

#

Btw log(base)(x) is just log(x) / log(base)
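
The change-of-base identity can be checked with the standard library. `log_base` is a made-up helper name:

```python
import math


def log_base(x, base):
    """Logarithm of x in an arbitrary base via the change-of-base identity."""
    return math.log(x) / math.log(base)


print(math.isclose(log_base(8, 2), 3.0))             # True
print(math.isclose(log_base(8, 2), math.log(8, 2)))  # True

# A fractional base such as 1/2 just flips the sign relative to base 2.
print(math.isclose(log_base(8, 0.5), -3.0))          # True
```

The same identity should carry over to arrays by writing `np.log(x) / np.log(base)`, which avoids needing a base argument at all.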

quartz thicket Apr 5, 2023, 3:47 PM

#

untold cliff Btw log(base)(x) is just log(x) / log(base)

I think I had that on a test yeaars ago 🙂

untold cliff Apr 5, 2023, 3:48 PM

#

quartz thicket I think I had that on a test yeaars ago 🙂

Haha, well it seems you have never used it outside that test.

quartz thicket Apr 5, 2023, 3:50 PM

#

hmm.... I think it's failing now because of a divide by 0. Is there a way I can make sure the variable in the denominator isn't allowed to be 0 when doing curve_fit?

untold cliff Apr 5, 2023, 3:52 PM

#

Where does the division happen ?

quartz thicket Apr 5, 2023, 3:52 PM

#

In my model function

def bestGuess(x, a, b, c):
    x = np.asarray(x)
    comp = a * np.emath.logn(x, 1/b) + c
    return comp

#

hmmm... perhaps using np.divide instead?

untold cliff Apr 5, 2023, 3:56 PM

#

You mean 1 / b ?

quartz thicket Apr 5, 2023, 3:58 PM

#

yeah

untold cliff Apr 5, 2023, 4:02 PM

#

Well i dont think you're passing a base 0 to your function, so no zero division error here

#

The problem might be somewhere else

quartz thicket Apr 5, 2023, 4:03 PM

#

No, that's where it is. I'm not passing a 0 directly. but curve_fit tries a bunch of variables in a b and c to fit the curve.

#

At least I think so

#

RuntimeWarning: divide by zero encountered in divide
return nx.log(x)/nx.log(n)

#

so maybe it's in emath.logn

untold cliff Apr 5, 2023, 4:07 PM

#

Hmm, i dont know how curve_fit works but if the problem is as you said then you can specify bounds for the parameters to be tried

#

Refer to the documentation of the function, bounds parameter, might be the solution

#

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

untold cliff Apr 5, 2023, 4:13 PM

#

quartz thicket > RuntimeWarning: divide by zero encountered in divide > return nx.log(x)/nx....

Hmm if emath.logn returns this, then the problem is in nx.log(n) which happens to be 0 when you pass 1 to it, so try excluding 1 from base bounds maybe

quartz thicket Apr 5, 2023, 4:24 PM

#

I think you're very likely correct about bounds. But that's just giving me a new warning, regardless of what I set bounds to. Curve fit is calling least_squares, and that's giving me:

ValueError: Residuals are not finite in the initial point.
And that's after trying a number of different bounds. Right now I'm using:

popt, pcov = curve_fit(bestGuess, Msol, M_v, bounds=([np.NINF, 2,np.NINF], np.PINF))

But even if I just use big and small ints, I get the same ValueError

#

like bounds=([999,2,999], 999)

untold cliff Apr 5, 2023, 4:48 PM

#

Hmm, then the problem is in the first point of your input and the value your function returns

#

Like if it's negative or something, or very close to 0, then you'll get infinity because there's no log(0)

#

Try inspecting it, this might be the problem

quartz thicket Apr 5, 2023, 4:51 PM

#

untold cliff Try inspecting it, this might be the problem

Yeah, that does seem possible. What do you mean by inspecting?

untold cliff Apr 5, 2023, 4:52 PM

#

See the output of your function for the first value

quartz thicket Apr 5, 2023, 4:56 PM

#

Ah I see. Printing the comp value does have a -Inf buried in it. Hmmm

#

Ah, because there is a 1 in that same index of Msol
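
One thing worth checking here is the argument order. As far as the numpy docs describe it, `np.emath.logn(n, x)` takes the base first and the value second, which is consistent with the `nx.log(x)/nx.log(n)` line in the warning. The sketch below uses a plain-Python stand-in with that same formula (it is not numpy's code) to show what swapping the arguments does:

```python
import math


def logn(n, x):
    """Stand-in mirroring the formula in the warning: log(x) / log(n)."""
    return math.log(x) / math.log(n)


b = 2.0
data_point = 8.0

# Base first, value second: log base 1/b of the data point.
print(logn(1 / b, data_point))

# Swapped order, as in logn(x, 1/b): log base data_point of 1/b.
print(logn(data_point, 1 / b))

# With the swapped order, a data value of exactly 1 becomes the base,
# and log(1) == 0 ends up in the denominator.
try:
    logn(1.0, 1 / b)
except ZeroDivisionError as exc:
    print("division by zero:", exc)
```

Plain Python raises here, whereas numpy would emit the RuntimeWarning and produce an infinite value. If that reading is right, writing the call as `np.emath.logn(1/b, x)` might remove the need to delete the point where Msol is 1, since log(1) would then sit in the numerator.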

mint palm Apr 5, 2023, 4:57 PM

#

why does the mathematical form of a loss function always include a binary variable y, when we dont have y in unsupervised learning? i dont see the unsupervised version mentioned

wooden sail Apr 5, 2023, 4:59 PM

#

that's because in unsupervised learning you don't have a reference y

#

you have some function of the input that is then inverted

#

the target is not norm(y - f(x))

mint palm Apr 5, 2023, 4:59 PM

#

wooden sail that's because in unsupervised learning you don't have a reference y

i am confused because i was asked by my prof. to apply contrastive loss to a tensor of shape (bs*tokens, embedding size), and he told me that after the contrastive loss i should have (bs*tokens) values

wooden sail Apr 5, 2023, 4:59 PM

#

it's norm(x - g(f(x)))
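
A small sketch of the norm(x - g(f(x))) idea, assuming a linear encoder f and decoder g built from the top principal directions (PCA). The data, the dimension k and the names are made up for illustration; no reference y appears anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # made-up unlabeled data
X = X - X.mean(axis=0)

# Top-k right singular vectors give a linear encoder/decoder pair (PCA).
k = 2
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k].T                         # shape (5, k)

def f(x):                            # encoder: 5 dims -> k dims
    return x @ W

def g(z):                            # decoder: k dims -> 5 dims
    return z @ W.T

recon_error = np.linalg.norm(X - g(f(X)), axis=1)   # one value per sample
loss = recon_error.mean()                           # scalar objective
```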

mint palm Apr 5, 2023, 5:00 PM

#

i dont see how, what he said is possible

quartz thicket Apr 5, 2023, 5:00 PM

#

I guess the easiest solution is just deleting that point. However, it's a shame, as it's the single most certain point.

wooden sail Apr 5, 2023, 5:01 PM

#

mint palm i am confused because i was asked by my prof. to apply contrastive loss of tens...

idk what contrastive loss is

#

ok, so a quick google search says its an example of what i said

#

you give an input x, and you want to produce another x that is close to it

mint palm Apr 5, 2023, 5:03 PM

#

hmm, actually it's a little deep, google search probably won't be helpful here.
My understanding is it takes a triplet, BUT triplet/ranking/etc. losses seem the same but are different.
I still don't know how he wanted me to make pairs

quartz thicket Apr 5, 2023, 5:03 PM

#

Now I just need to figure out why my fitment is absolute hot garbage 🤣 😭

wooden sail Apr 5, 2023, 5:07 PM

#

mint palm i am confused because i was asked by my prof. to apply contrastive loss of tens...

the wording in this is a little confusing. wdym by "after the loss"? the loss functions most commonly employed are scalar-valued. you can average over several examples, sure

#

if you can word the problem a bit more generally, i can give you a hand. i'm somewhat familiar with optimization tasks. i don't know much specifically about this task you're talking about, but off the top of my head it seems similar to what nonlinear component analysis does: try to learn a metric

mint palm Apr 5, 2023, 5:09 PM

#

wooden sail the wording in this is a little confusing. wdym by "after the loss"? the loss fu...

in hard mining and other scenarios, you have the option to do mean/sum/max over a vector of values
so "values" means the step just before the scalar is calculated

wooden sail Apr 5, 2023, 5:10 PM

#

mint palm in hard mining and other scenarios, you have the option to do mean/sum/max at ve...

this is very different from what you said your prof told you

#

this is BEFORE computing the loss

mint palm Apr 5, 2023, 5:10 PM

#

i can explain, one sec.

#

So, let's talk about Noise Contrastive Estimation first. It is very similar to something like an UNSUPERVISED cross entropy loss with temperature. what we do is the following:
you need
anchor (bs, embed size)
positive (bs, embed size)
negative (bs, embed size) (optional, can also use positives from other pairs)
then do
res = anchor @ positive.t()
Now along the diagonal you have the similarity scores of the correct pairs; the other positions hold incorrect pairs
if you do vec = res.diag() you have a vector. Now you can do hard mining by doing max(vec)
so here "values" refers to the scalars on the diagonal, one per row.
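
The steps above can be sketched in NumPy as an InfoNCE-style loss that keeps one value per row before reducing. This is an illustration under assumptions, not the professor's method: the embeddings are random, the positives are noisy copies of the anchors, and the temperature of 0.1 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
bs, embed = 4, 8
anchor = rng.normal(size=(bs, embed))
positive = anchor + 0.1 * rng.normal(size=(bs, embed))  # hypothetical "similar" samples

# L2-normalise so the dot product is a cosine similarity.
anchor /= np.linalg.norm(anchor, axis=1, keepdims=True)
positive /= np.linalg.norm(positive, axis=1, keepdims=True)

temperature = 0.1
logits = anchor @ positive.T / temperature   # (bs, bs); diagonal = correct pairs

# Cross entropy with the diagonal as the target class, kept per sample.
logits -= logits.max(axis=1, keepdims=True)  # numerical stability
log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
per_sample_loss = -np.diag(log_prob)         # shape (bs,): the "values"

mean_loss = per_sample_loss.mean()           # usual scalar loss
hardest = per_sample_loss.max()              # hard-mining style reduction
```

With an input of shape (bs*tokens, embedding size), `per_sample_loss` would have bs*tokens entries, which may be what "values" meant.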

#

@wooden sail i hope this connect dots of contrastive, values, etc

wooden sail Apr 5, 2023, 5:18 PM

#

this info does not help at all 😛

#

what do anchor, positive, negative, res, bs, and embed size mean?

#

we can talk about unsupervised learning completely separately from your specific application and cost function, as it's a framework

mint palm Apr 5, 2023, 5:20 PM

#

wooden sail what do anchor, positive, negative, res, bs, and embed size mean?

can assume NLP task, anchor is just text, positive would be similar text, bs is BATCH size, embed size is the size of the representation of a token

wooden sail Apr 5, 2023, 5:21 PM

#

ok, sure, you're taking an inner product and applying cauchy-schwarz

quartz thicket Apr 5, 2023, 5:22 PM

#

Hmm... getting closer with

comp = a * b ** -x + c

#

but still pretty off the mark

mint palm Apr 5, 2023, 5:22 PM

#

wooden sail ok, sure, you're taking an inner product and applying cauchy-schwarz

i forgot what that was, i read it 2 years back

wooden sail Apr 5, 2023, 5:22 PM

#

mint palm i forgot what that was, i read it 2 years back

it's the only thing you needed to explain the scenario 😛

quartz thicket Apr 5, 2023, 5:22 PM

#

I'm still pretty off target, as Msol gets larger. But this kinda feels like magic

untold cliff Apr 5, 2023, 5:24 PM

#

quartz thicket

This looks a lot like the function 1/x but shifted

wooden sail Apr 5, 2023, 5:25 PM

#

quartz thicket I don't think numpy.log lets me define the base, so I resorted to math.log, but...

you can use a change of base with logs

quartz thicket Apr 5, 2023, 5:25 PM

#

untold cliff This looks a lot like the function 1/x but shifted

I thought so too. But adding a 1/ in there didn't do much for me last I tried, I think because a scales it

mint palm Apr 5, 2023, 5:25 PM

#

i think i will stay confused until prof explain, lmao

quartz thicket Apr 5, 2023, 5:26 PM

#

wooden sail you can use a change of base with logs

Yeah, @untold cliff got me hooked up with np.emath.logn() which is what I needed. And showed me the old base change formula too.
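
For reference, a short sketch of the change-of-base formula next to np.emath.logn. The helper name `log_base` is made up.

```python
import numpy as np

def log_base(x, base):
    # Change of base: log_base(x) = ln(x) / ln(base)
    return np.log(x) / np.log(base)

a = log_base(8.0, 2.0)        # log base 2 of 8
b = np.emath.logn(2, 8)       # same thing; note the base comes first here
```

Both divide by the log of the base, so a base of exactly 1 means dividing by log(1) = 0, which is where the divide-by-zero warning comes from.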

wooden sail Apr 5, 2023, 5:27 PM

#

all right

#

also fyi, your problem is not convex. it's littered with local optima and your result will depend on your initial guess of the parameters

quartz thicket Apr 5, 2023, 5:28 PM

#

Yeah, I'm doing my best to narrow in that initial guess. What do you mean by local optima?

wooden sail Apr 5, 2023, 5:28 PM

#

places where the gradient becomes zero and the hessian is positive definite, but are not the true solution
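
A toy illustration of that definition, using a made-up one-dimensional function rather than the actual fitting problem: plain gradient descent ends up in a different minimum depending on where it starts.

```python
def f(x):
    # Made-up non-convex function with two minima of different depth.
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    # Plain gradient descent from the starting point x.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

left = descend(-2.0)    # ends in the deeper (global) minimum
right = descend(2.0)    # ends in a shallower local minimum

print(left, f(left))
print(right, f(right))
```

Both end points have zero gradient and positive curvature, but only one is the best solution. Trying several starting guesses and keeping the best fit is a common workaround.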

quartz thicket Apr 5, 2023, 5:29 PM

#

Ah, I see. It's all the data I have to work with though.

wooden sail Apr 5, 2023, 5:29 PM

#

it has nothing to do with the data

#

it's the model 😛

quartz thicket Apr 5, 2023, 5:30 PM

#

You mean my model function?

wooden sail Apr 5, 2023, 5:30 PM

#

yep

quartz thicket Apr 5, 2023, 5:30 PM

#

Not sure how to find one that is more likely to work?

#

comp = a * b/x + c yields this:

wooden sail Apr 5, 2023, 5:31 PM

#

it's just a lesson that optimization is difficult. even if you know perfectly what the model is, it may be impossible to find the parameters

quartz thicket Apr 5, 2023, 5:31 PM

#

wooden sail it's just a lesson that optimization is difficult. even if you know perfectly wh...

https://tenor.com/view/dont-say-that-shut-up-shh-silence-you-cant-say-that-gif-17484258

#

Hmmm... maybe I can use the covariance matrix it produces to get better results?

quartz thicket Apr 5, 2023, 5:49 PM

#

wooden sail it's just a lesson that optimization is difficult. even if you know perfectly wh...

Any other suggested next steps? It feels like

comp = a * b ** x + c

Is pretty close as far as the shape is concerned

wooden sail Apr 5, 2023, 5:51 PM

#

i don't really have any recommendations. how about a/x + c

quartz thicket Apr 5, 2023, 5:51 PM

#

wooden sail i don't really have any recommendations. how about a/x + c

That's this one: #data-science-and-ml message

untold cliff Apr 5, 2023, 5:53 PM

#

quartz thicket That's this one: https://discord.com/channels/267624335836053506/366673247892275...

Did you try setting the bounds for a and c ?

quartz thicket Apr 5, 2023, 5:53 PM

#

Hmm... actually, I think restricting the bounds of a b and c might...

quartz thicket Apr 5, 2023, 5:53 PM

#

untold cliff Did you try setting the bounds for a and c ?

lol. yeah trying to be more clever about that now.

untold cliff Apr 5, 2023, 5:53 PM

#

Try some online plotting tool, play with these values a little to get a feeling of the bounds

sleek harbor Apr 5, 2023, 5:54 PM

#

mild dirge Yeah, this does not leak the data. Nothing about the generated features is affec...

So doing it like this is fine?

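`from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X_poly, y, test_size=0.25, random_state=None)

lr = LinearRegression(positive=False)
lr.fit(X_train, y_train)  # fit on the training targets only, not all of y

Yhat = lr.predict(X_test)`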

untold cliff Apr 5, 2023, 5:54 PM

#

Btw, you only need a / x + c, no need for b as it would be multiplied by a and thus just another coefficient

wooden sail Apr 5, 2023, 5:57 PM

#

adding an extra parameter that cannot be separated makes the estimation task more difficult

#

try with just parameters a and c
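
A sketch of why the two-parameter form is easier, using made-up data: a/x + c is linear in (a, c), so ordinary least squares solves it directly with no initial guess, while in a*b/x + c only the product a*b can be determined.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.5, 20.0, 40)
a_true, c_true = 3.0, -1.0                 # made-up values for the demo
y = a_true / x + c_true + 0.01 * rng.normal(size=x.size)

# a/x + c is linear in (a, c): design matrix with columns 1/x and 1.
A = np.column_stack([1.0 / x, np.ones_like(x)])
(a_hat, c_hat), *_ = np.linalg.lstsq(A, y, rcond=None)

# With a*b/x + c only the product a*b matters: these two parameter sets
# give identical predictions, so the data cannot tell them apart.
pred_1 = 2.0 * 1.5 / x + c_true
pred_2 = 6.0 * 0.5 / x + c_true
```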

quartz thicket Apr 5, 2023, 5:57 PM

#

omg!

#

It's beyoooooteeful!

wooden sail Apr 5, 2023, 5:58 PM

#

nice. which model?

quartz thicket Apr 5, 2023, 5:58 PM

#

comp = a / x**b + c

wooden sail Apr 5, 2023, 5:58 PM

#

ooh

#

what value of b did you get

quartz thicket Apr 5, 2023, 5:59 PM

#

[ 19.12270031 0.2224922 -14.09644528]
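
For reference, a sketch of how a fit of this form might be set up with curve_fit. It assumes the printed values are in the order (a, b, c). The data here is synthetic and noise-free, generated from those values over a made-up x range; the starting guess and bounds are also made up.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    return a / x**b + c

# Synthetic, noise-free data generated from the parameters reported above.
true_params = (19.12270031, 0.2224922, -14.09644528)
x = np.logspace(-1, 1.5, 30)        # made-up x range, all positive
y = model(x, *true_params)

popt, pcov = curve_fit(
    model, x, y,
    p0=[10.0, 0.5, -5.0],                               # explicit starting guess
    bounds=([0.0, 0.0, -100.0], [100.0, 5.0, 100.0]),   # lower, upper per parameter
)
print(popt)
```

Passing p0 explicitly matters for a non-convex problem like this one, since curve_fit otherwise picks its own default starting values.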

wooden sail Apr 5, 2023, 5:59 PM

#

interesting

untold cliff Apr 5, 2023, 5:59 PM

#

Congratulations 👏🎉

quartz thicket Apr 5, 2023, 5:59 PM

#

Thank you @wooden sail and @untold cliff for all your help!

fiery jungle Apr 5, 2023, 6:30 PM

#

hey,
could someone look at this tutorial and explain to me why 128 exactly ?? does it matter if i changed it ? https://youtu.be/bemDFpNooA8?list=PLQY2H8rRoyvwWuPiWnuTDBHe7I0fMSsfO&t=198

#

im still trying to wrap my head around AI

mild dirge Apr 5, 2023, 6:33 PM

#

It's a bit arbitrary, some values will work better than others. @fiery jungle

#

It just means that layer has 128 nodes

fiery jungle Apr 5, 2023, 6:35 PM

#

mild dirge It's a bit arbitrary, some values will work better than others. <@7115810142076...

ah ok , that means i could try something like 64 or 256 and get better results , right ? there is no real logic behind it or a certain principle to follow or calculate, correct ?

mild dirge Apr 5, 2023, 6:37 PM

#

There is a bit of logic behind it. The value can determine a lot of behavior of your model, and there is a bit of predictability. Often people try a few values, see how well the model performs, and then stick with it.

#

Lower values will generally give less of a chance to overfit than higher values, but higher values are more likely to model a more complex function.
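
One way to see what the number changes is to count the trainable parameters of a fully connected layer. The input size of 784 is an assumption (a flattened 28x28 image); the tutorial's actual input is not stated here.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * n_units + n_units

for units in (64, 128, 256):
    print(units, dense_params(784, units))
```

More units means more parameters, which is roughly where the extra capacity (and the extra room to overfit) comes from.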

fiery jungle Apr 5, 2023, 6:38 PM

#

holy cow look what chatgpt said when i fed him the same question !!! did he just watch a youtube video ??

fiery jungle Apr 5, 2023, 6:39 PM

#

mild dirge Lower values will generally give less of a chance to overfit than higher values,...

thank you so much ❤️ ❤️

mild dirge Apr 5, 2023, 6:40 PM

#

I'm watching a tutorial on a game that has nothing to do with skateboarding..

#

Are you using gpt4?

fiery jungle Apr 5, 2023, 6:41 PM

#

mild dirge Are you using gpt4?

yes !

#

use it on the browser, some apps claim that they have chat gpt4 but they actually have a crappy old version of it.

mild dirge Apr 5, 2023, 6:42 PM

#

No this is just gpt3

fiery jungle Apr 5, 2023, 6:42 PM

#

yes with better model

#

i literally just copied and pasted the same message i sent here as shown in the picture

#

i wonder if it analyzed the video or just read context of the subtitles

agile cobalt Apr 5, 2023, 6:44 PM

#

try asking it what is the title of the video, then try again with a more recent video from after the training data cutoff

fiery jungle Apr 5, 2023, 6:45 PM

#

hmm ok i will ask for a recently released video

#

https://youtu.be/WZav2l0g724?t=142

#

he got it right !!

#

that was released 19h ago

#

are we in the future ? 😄

agile cobalt Apr 5, 2023, 6:58 PM

#

huh, pretty neat and a little surprising, but to be fair considering the features the BingGPT and GPT4 demos promised it does seem reasonable

fiery jungle Apr 5, 2023, 7:01 PM

#

what is concerning is , if this massive futuristic genius is what they have released to the public, .... what are they hiding up their sleeves?

violet gull Apr 5, 2023, 7:04 PM

#

Edd is no longer Edd

#

why

wooden sail Apr 5, 2023, 7:05 PM

#

it has been brought to my attention that 40 out of 43 (i guess 41 out of 44 now) conversations involving toeplitz matrices in this server have involved me

violet gull Apr 5, 2023, 7:06 PM

#

Edd how convolution backpropagation

wooden sail Apr 5, 2023, 7:06 PM

#

ah that's pretty rough

#

coincidentally, the easiest way would be to either pass through a large toeplitz matrix and rewrite that as a sum, or directly rewrite the convolution as a sum, then differentiate that

violet gull Apr 5, 2023, 7:07 PM

#

how pytorch does it

#

?

wooden sail Apr 5, 2023, 7:07 PM

#

it builds a computational graph and uses that to only differentiate simple functions

#

in this case, since convolution is linear, it's addition and multiplication composed with whatever activation function you use

violet gull Apr 5, 2023, 7:08 PM

#

how get same results as pytorch but by using actual derivatives

wooden sail Apr 5, 2023, 7:09 PM

#

pytorch uses actual derivatives

wooden sail Apr 5, 2023, 7:09 PM

#

wooden sail coincidentally, the easiest way would be to either pass through a large toeplitz...

but you can always do this

violet gull Apr 5, 2023, 7:09 PM

#

easiest isnt best

wooden sail Apr 5, 2023, 7:09 PM

#

the best is to build your own computational graph, if you want to be able to arbitrarily nest your functions

violet gull Apr 5, 2023, 7:10 PM

#

what is that

wooden sail Apr 5, 2023, 7:10 PM

#

https://blog.paperspace.com/pytorch-101-understanding-graphs-and-automatic-differentiation/

Paperspace Blog

PyTorch Basics: Understanding Autograd and Computation Graphs

In this article, we dive into how PyTorch's Autograd engine performs automatic differentiation.

violet gull Apr 5, 2023, 7:10 PM

#

ok i dont want to do that

#

i thought conv backprop involved reversing it and then doing a convolution with a padded matrix

wooden sail Apr 5, 2023, 7:10 PM

#

you can't avoid it. from this point on, it's ALL math. your only choice is how to do the math

#

idk, maybe that's the case for a single layer. the only way to show that is by doing what i told you

#

write it explicitly as a sum, differentiate that, and see what the resulting structure is

violet gull Apr 5, 2023, 7:12 PM

#

mmmm

#

idk what that mean

wooden sail Apr 5, 2023, 7:12 PM

#

i'm not going to do it for you

violet gull Apr 5, 2023, 7:12 PM

#

i need like a step by step

wooden sail Apr 5, 2023, 7:13 PM

#

https://towardsdatascience.com/backpropagation-in-a-convolutional-layer-24c8d64d8509 maybe check this out? i usually don't like their blogs due to poor quality control, but take a look

Medium

Backpropagation in a convolutional layer

The aim of this post is to detail how gradient backpropagation is working in a convolutional layer of a neural network.

#

https://iopscience.iop.org/article/10.1088/1742-6596/1004/1/012027/pdf this looks nice

#

it's really a lot easier to notice that convolution is multiplication by a toeplitz matrix

#

this immediately tells you how to write the derivatives
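
A 1-D sketch of that observation. The helper `conv_matrix`, the kernel and the signal are made up; the upstream gradient is random stand-in data.

```python
import numpy as np

def conv_matrix(kernel, n):
    """Toeplitz matrix T such that T @ x == np.convolve(x, kernel) for len(x) == n."""
    k = len(kernel)
    T = np.zeros((n + k - 1, n))
    for j in range(n):
        T[j:j + k, j] = kernel      # each column is the kernel shifted down by one
    return T

rng = np.random.default_rng(0)
x = rng.normal(size=6)
w = np.array([1.0, -2.0, 0.5])

T = conv_matrix(w, len(x))
y = T @ x
assert np.allclose(y, np.convolve(x, w))

# For y = T @ x, the chain rule gives dL/dx = T.T @ dL/dy.
grad_y = rng.normal(size=y.shape)    # stand-in for the upstream gradient
grad_x = T.T @ grad_y

# T.T @ g is a correlation: convolve g with the reversed kernel and keep the
# "valid" part, which matches the "flip the kernel" description of conv backprop.
assert np.allclose(grad_x, np.convolve(grad_y, w[::-1], mode="valid"))
```

The same idea carries over to 2-D convolutions, where the matrix becomes block Toeplitz; this sketch only covers the 1-D case.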

violet gull Apr 5, 2023, 7:15 PM

#

my friend say Toeplitz is a made up word

wooden sail Apr 5, 2023, 7:15 PM

#

you lost my attention

violet gull Apr 5, 2023, 7:16 PM

#

so if i make a computational graph

#

it will tell me if my backpropagation is working right

#

ima do that

verbal venture Apr 5, 2023, 7:21 PM

#

can someone explain to me from a software engineering + deep learning perspective what is going on in these lines of code?

for inputs, targets in test_dataloader:
    outputs = model(inputs)  # output of the model
    _, predictions = torch.max(outputs, 1)

what type of tensor should outputs be returning? what's the relevance of the _? And I understand that predictions returns an array of the correct classifications, but how is that working from a coding perspective?

violet gull Apr 5, 2023, 7:24 PM

#

verbal venture can someone explain to me from a software engineering + deep learning perspectiv...

"_" means torch.max produces 2 values but only the second one is of importance

verbal venture Apr 5, 2023, 7:24 PM

#

okay, but what data is in _

violet gull Apr 5, 2023, 7:24 PM

#

nothing you need; it holds the max values themselves, which just get ignored

#

its the same as putting a variable name there and never using the variable

#

for inputs, targets in test_dataloader:: This line initiates a for loop that iterates over the batches of data in the test dataset. The test_dataloader object is likely an instance of the PyTorch DataLoader class, which provides an iterable interface to a dataset by batching and shuffling the data.

outputs = model(inputs): This line passes the input batch inputs through the trained machine learning model model to obtain the predicted outputs. For a classifier, the outputs tensor typically has shape (batch_size, num_classes), with one score per class for each sample, so it does not have the same shape as targets.

_, predictions = torch.max(outputs, 1): This line uses the PyTorch torch.max() function to find the maximum value and corresponding index along the second dimension (i.e., the dimension corresponding to the number of classes) of the outputs tensor. The maximum value is discarded (indicated by the _ variable) and the predicted class labels are stored in the predictions tensor. The predictions tensor is a one-dimensional tensor with one entry per sample in the batch, matching the length of targets.

verbal venture Apr 5, 2023, 7:28 PM

#

so predictions != labels yeah? it's just the associated index for each of the maximum tensors? and the max tensors are the tensors returned from the outputs after they've been run in the model, so it's the most probable classification index yeah?

violet gull Apr 5, 2023, 7:29 PM

#

yes

sleek harbor Apr 5, 2023, 7:41 PM

#

The need to standardize X isn't accompanied by the need to standardize y, right?
Well is there ever a need to standardize y?

serene scaffold Apr 5, 2023, 7:43 PM

#

sleek harbor The need to standardize *X* isn't accompanied by the need to standardize *y*, ri...

is this in response to something else?

sleek harbor Apr 5, 2023, 7:44 PM

#

serene scaffold is this in response to something else?

Nah, it's just a question

serene scaffold Apr 5, 2023, 7:44 PM

#

sleek harbor Nah, it's just a question

you need to be consistent in how you represent everything in your ML code

sleek harbor Apr 5, 2023, 7:49 PM

#

serene scaffold you need to be consistent in how you represent everything in your ML code

I get that you should standardize independent variables where needed, for KNN for example, but should you standardize the dependent variable as well?

serene scaffold Apr 5, 2023, 7:50 PM

#

sleek harbor I get that you should standardize independent variables where needed, for KNN fo...

what do you mean by "standardize", in this context?

#

and what type of data are the independent and dependent variables?

sleek harbor Apr 5, 2023, 7:51 PM

#

serene scaffold what do you mean by "standardize", in this context?

x-mean/std

sleek harbor Apr 5, 2023, 7:52 PM

#

serene scaffold and what type of data are the independent and dependent variables?

Idk, numbers? I don't have a particular example, I'm just thinking theoretically. Like, should I ever standardize the dependent variable

serene scaffold Apr 5, 2023, 7:54 PM

#

sleek harbor Idk, numbers? I don't have a particular example, I'm just thinking theoretically...

if your X data are numbers, one normalization technique is to rescale all the numbers so the maximum value is 1 and the minimum value is 0. and if your y data are also numbers, you could do the same thing (though the scale will be different). but you can't do that if your y data are categories, for example.

versed gulch Apr 5, 2023, 7:54 PM

#

Are there any radius detection algorithms used for 3D images in python?

sleek harbor Apr 5, 2023, 7:56 PM

#

serene scaffold if your X data are numbers, one normalization technique is to rescale all the nu...

I get that. But what I don't get is, why would I want/need to normalize the y data in the first place? What will that achieve? Won't the results be the same as if I didn't do anything with the y data?

serene scaffold Apr 5, 2023, 7:57 PM

#

sleek harbor I get that. But what I don't get is, why would I want/need to normalize the y da...

the same reasons as normalizing the x data.

sleek harbor Apr 5, 2023, 7:59 PM

#

serene scaffold the same reasons as normalizing the x data.

But.. isn't the point of normalizing x data in preventing one feature from dominating the other? Whereas there's only 1 y variable, so.. there's nothing to dominate over, so it shouldn't matter?
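
A sketch of what standardizing y looks like in practice, with made-up target values. For plain least-squares linear regression, rescaling y only rescales the fitted coefficients, so the predictions are equivalent. It tends to matter more for gradient-based training or regularized models, where very large target values can slow or destabilize the fit. Either way, predictions have to be mapped back to the original units.

```python
import numpy as np

y = np.array([120000.0, 250000.0, 310000.0, 95000.0])   # made-up targets

mu, sigma = y.mean(), y.std()
y_std = (y - mu) / sigma            # zero mean, unit variance

# ... fit the model on y_std, then map predictions back to original units
pred_std = np.array([0.0, 1.0])     # stand-in for model outputs
pred = pred_std * sigma + mu
```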

raw compass Apr 5, 2023, 8:10 PM

#

can you guys recommend books for maths especially in machine learning?

high folio Apr 5, 2023, 8:22 PM

#

raw compass can you guys recommend books for maths especially in machine learning?

Mathematics for machine learning

verbal venture Apr 5, 2023, 8:29 PM

#

hey, my predictions from torch.max() is returning [3, 1, 2, 2, 1, 0, 1, 1, 0, 0, 1, 3, 0, 0, 0, 3, 3, 1, 1, 0, 1, 3, 1, 1, 2, 3, 1, 0, 1, 3, 2, 2]. My model is supposed to be a multilabel classification. I'm not sure how to return a label

serene scaffold Apr 5, 2023, 8:38 PM

#

verbal venture hey, my predictions from torch.max() is returning `[3, 1, 2, 2, 1, 0, 1, 1, 0, 0...

looks like this is the second of the two return values we talked about last week. it represents the index of the maximum value, for each row

#

so it only gives you one per row. you have to know what class each number represents, perhaps by keeping it in a dictionary.
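
A sketch of the dictionary idea. The class names are hypothetical placeholders; the real ones depend on how the dataset was labelled.

```python
# Hypothetical class names; the real ones depend on how the dataset was labelled.
idx_to_class = {0: "class_a", 1: "class_b", 2: "class_c", 3: "class_d"}

predictions = [3, 1, 2, 2, 1, 0]            # first few indices from the message above
labels = [idx_to_class[i] for i in predictions]
```

If predictions is a tensor, converting it with `.tolist()` first keeps the dictionary lookups on plain Python ints.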

#

when you say "multilabel", you mean that each instance can have more than one label/class, yes?

verbal venture Apr 5, 2023, 8:41 PM

#

I have 4 classes of eye diseases (indexes 0 -> 3). It should return only 1, but there are 4 possible labels

serene scaffold Apr 5, 2023, 8:42 PM

#

so that's what [3, 1, 2, 2, 1, 0, 1, 1, 0, 0, 1, 3, 0, 0, 0, 3, 3, 1, 1, 0, 1, 3, 1, 1, 2, 3, 1, 0, 1, 3, 2, 2] is

verbal venture Apr 5, 2023, 8:42 PM

#

why are there so many values?

austere swift Apr 5, 2023, 9:02 PM

#

verbal venture can someone explain to me from a software engineering + deep learning perspectiv...

outputs is a tensor of whatever shape the model outputs (that's defined in the model itself). The _ signifies that you're not using the values of the outputs, just the indices (torch.max returns values and indices).

#

so predictions will be a tensor of indices corresponding to the maximum value of each row (since you specified dim as 1, which is the second argument to torch.max)

#

if you specified dim as 0, it would be the maximum value of each column

#

and so on

#

The reason you do that is because the model returns a batch of outputs, which is a tensor of size (batch_size, num_classes), so getting the index of the maximum value in each row gives you the prediction for each sample in the batch
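
The same idea in NumPy, as an analogue rather than the PyTorch call itself: the max and argmax along axis 1 correspond to the two values torch.max(outputs, 1) returns. The scores below are made up.

```python
import numpy as np

# Stand-in for model outputs: batch of 3 samples, 4 classes.
outputs = np.array([
    [0.1, 2.0, -1.0, 0.3],
    [1.5, 0.2, 0.1, 0.0],
    [-0.5, 0.0, 0.2, 3.1],
])

values = outputs.max(axis=1)          # what torch.max(outputs, 1) returns first
predictions = outputs.argmax(axis=1)  # ...and second: one class index per sample
```

A batch of 32 samples therefore gives 32 indices, which is why the predictions list above has so many values.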

austere swift Apr 5, 2023, 9:05 PM

#

verbal venture hey, my predictions from torch.max() is returning `[3, 1, 2, 2, 1, 0, 1, 1, 0, 0...

there's multiple labels since you have a batch size more than 1

#

so each label there corresponds to the input batch's samples

brazen cipher Apr 5, 2023, 9:06 PM

#

Hello! So for my particular project, I would need Anaconda version 3.9 however only Anaconda 3.10 is available. Is there a place where I can get version 3.9?

austere swift Apr 5, 2023, 9:06 PM

#

brazen cipher Hello! So for my particular project, I would need Anaconda version 3.9 however o...

when you create a conda environment you can set it to use python 3.9

#

it would just be conda create -n env_name python=3.9

brazen cipher Apr 5, 2023, 9:06 PM

#

Gotcha, thanks!

austere swift Apr 5, 2023, 9:07 PM

#

the anaconda version only determines the python version of the base environment

lapis sequoia Apr 5, 2023, 9:07 PM

#

guys, i wanna try to code a bot that repetitively tries to claim an xbox gamertag until it's claimed, dm me please.

austere swift Apr 5, 2023, 9:08 PM

#

that sounds like it would be a breach of the xbox ToS

#

which would break rule 5

#

!rule 5

arctic wedgeBOT Apr 5, 2023, 9:09 PM

#

Rules

5. Do not provide or request help on projects that may break laws, breach terms of services, or are malicious or inappropriate.

upper verge Apr 5, 2023, 9:12 PM

#

🤯

grand sinew Apr 5, 2023, 9:47 PM

#

how could one plot an envelope like this?

#

the hilbert function from scipy only plots one for the maxima

lapis sequoia Apr 5, 2023, 9:50 PM

#

ooooo wavy

untold cliff Apr 5, 2023, 10:15 PM

#

grand sinew

I think you could take the negative of your function as it would flip it with respect to the x-axis, and then apply this Hilbert function and take its negative as well

wooden sail Apr 5, 2023, 10:20 PM

#

the envelope should be symmetric

#

take the envelope and multiply it by -1

grand sinew Apr 5, 2023, 10:23 PM

#

untold cliff I think you could take the negative of your function as it would flip it with re...

that's a good shout
will try that!

wooden sail Apr 5, 2023, 10:26 PM

#

!e

import numpy as np
from scipy.signal import hilbert
import matplotlib.pyplot as plt

fs = 1000
Nt = 50
t = np.arange(Nt)/fs

carrier = np.cos(2*np.pi*100*t)
envelope = np.cos(2*np.pi*10*t + 120*np.pi/180) +\
    2*np.cos(np.pi*5*t + 31*np.pi/180) +\
    5*np.cos(np.pi*17*t + 47*np.pi/180)
signal = carrier*envelope

estimated_envelope = np.abs(hilbert(signal))

plt.plot(t, signal)
plt.plot(t, estimated_envelope)
plt.plot(t, -estimated_envelope)
plt.legend(("signal","positive envelope","negative envelope"))
plt.savefig("BIG_OOF.png")

#

geez

arctic wedgeBOT Apr 5, 2023, 10:26 PM

#

@wooden sail :warning: Your 3.11 eval job timed out or ran out of memory.

[No output]

wooden sail Apr 5, 2023, 10:27 PM

#

seriously?

#

here's a demo, anyway @grand sinew

thorn swift Apr 5, 2023, 10:29 PM

#

anybody know any good pretrained text classifiers?

grand sinew Apr 5, 2023, 10:29 PM

#

wooden sail here's a demo, anyway @grand sinew

beautiful

untold cliff Apr 5, 2023, 10:29 PM

#

wooden sail

That's nice. Is this a property of trig functions ?

wooden sail Apr 5, 2023, 10:30 PM

#

untold cliff That's nice. Is this a property of trig functions ?

which one?

untold cliff Apr 5, 2023, 10:30 PM

#

wooden sail which one?

Symmetry of the envelope

wooden sail Apr 5, 2023, 10:30 PM

#

no

#

well. it's a property of modulated carriers, and carriers are trig functions. so in that sense, sure.

#

but usually speaking of envelopes already implies you're working with modulated carriers

#

if you have a trigonometric func as a carrier, call it c(t), and an envelope e(t), then the signal e(t)c(t) has a symmetric envelope, since e(t) becomes the amplitude of the sinusoid in the carrier

untold cliff Apr 5, 2023, 10:36 PM

#

Aah i see. Thanks!

wooden sail Apr 5, 2023, 10:40 PM

#

made it a little better-looking

fiery jungle Apr 5, 2023, 11:24 PM

#

hey,
i wonder if AI requires lots of permanent memory to operate, like if the dataset is on a cloud , does it still require a big hard drive to operate?
and how about memory? ... could a regular computer or an old device host something like chatgpt since all its datasets are hosted online ?

queen cradle Apr 6, 2023, 12:45 AM

#

untold cliff I see, that's makes sense. Are there any cases where imputing with the mean (or ...

I'm not aware of any. I can imagine a situation where this could be construed as reasonable—if you know that the missing part of the data is independent of the rest, and if that missing part is pretty tight around its mean (median, etc.). But in that case, why impute data? So I don't know of any situation where the procedure you describe is actually a good idea. (Besides which, my general advice for imputation is to avoid it when possible. It's very easy to make mistakes that affect your analysis.)

queen cradle Apr 6, 2023, 1:02 AM

#

untold cliff That's nice. Is this a property of trig functions ?

In the most general setting, this is really a property of things that you can take the Hilbert transform of. Suppose you have a function f(z) which is holomorphic in a neighborhood of the closure of the upper half plane. For a real number x, define g(x) = Re f(x). Assume g is continuously differentiable. Also assume that f satisfies a decay condition as |z| -> infty, e.g., f(z) = O(|z|^{-1}). Then, for x on the real axis, (Im f)(x) is the Hilbert transform of g(x). By combining g and its Hilbert transform, you can compute f on the real axis (general properties of holomorphic functions ensure there's a unique extension to a neighborhood of the closure of the upper half plane but say nothing about how to compute it). The reason why this gives you something that looks like an envelope is because cos x is the real part of e^{ix}, which has constant absolute value. That is, the Hilbert transform tells you how to fill in sinusoidal wiggles!

untold cliff Apr 6, 2023, 1:10 AM

#

queen cradle I'm not aware of any. I can imagine a situation where this could be construed as...

Could you give examples of these mistakes? Like including the target variable in the features used for imputation maybe?

queen cradle Apr 6, 2023, 1:12 AM

#

The biggest problem is that whether or not you measured a data value may have something to do with that value.

#

For instance, I gave a time series example earlier where the sensor stopped working when the reading was too high.
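
A small simulation of that sensor scenario, with made-up numbers: readings above a threshold go missing, and filling the gaps with the observed mean pulls the overall mean down.

```python
import numpy as np

rng = np.random.default_rng(0)
true_readings = rng.normal(loc=50.0, scale=10.0, size=10_000)

# Hypothetical sensor that fails whenever the reading exceeds 60.
observed = np.where(true_readings > 60.0, np.nan, true_readings)

# Mean imputation fills the gaps with the mean of what was observed.
fill = np.nanmean(observed)
imputed = np.where(np.isnan(observed), fill, observed)

print("true mean:   ", true_readings.mean())
print("imputed mean:", imputed.mean())
```

The imputed series never exceeds 60, so both its mean and its spread understate the real process.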

grand sinew Apr 6, 2023, 1:15 AM

#

wooden sail seriously?

realised I didn't even need to use the funky Hilbert function or anything else
as my function is bounded by a cos term 😐
so the max and min are just +/- of the other part of the function
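
A sketch of that shortcut, assuming the function has the form amplitude(t) * cos(...) with a non-negative amplitude term. The decaying exponential and the 40 Hz cosine are made up.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2000)
amplitude = np.exp(-3.0 * t)                 # hypothetical non-negative amplitude term
signal = amplitude * np.cos(2 * np.pi * 40 * t)

# Because |cos| <= 1, the signal always lies between -amplitude and +amplitude.
upper = amplitude
lower = -amplitude
```

Plotting `signal`, `upper` and `lower` against `t` then gives both envelope curves without any Hilbert transform.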

queen cradle Apr 6, 2023, 1:16 AM

#

Another case turns up when doing surveys. For example, imagine that you're doing political polls. You dial a number. They answer, but when you get halfway through the poll and ask them about a really sensitive topic, they hang up. Why? Ideally for the pollster, people hang up totally randomly, for reasons completely unconnected to their opinions. In reality, that's not true. Some people are more likely to answer polls than others. So when you have a missing data point—a person who hung up midway through—it's quite likely that imputing missing responses based on other people's responses will be wrong.

quartz thicket Apr 6, 2023, 2:47 AM

#

I'm using pyomo with the ipopt solver. I want to define a derived variable using an if/elif/else clause, or something similar that could produce the same results. If it didn't have to be compatible with the solver, I'd just use this python function:

def star_lum(mass):
    """
    Calculates the luminosity of a star with the given mass.
    """
    # Define the Mass-Luminosity Relation
    if mass < 0.43:
        luminosity = 0.23 * mass ** 2.3
    elif mass < 2:
        luminosity = mass ** 4
    elif mass < 20:
        luminosity = 1.5 * mass ** 3.5
    else:
        luminosity = 32000 * mass

    return luminosity

But I can't include function calls in derived variable definitions.

I've looked into piecewise() but I don't think it'll actually work, (unless I'm misunderstanding it or doing it wrong.)
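
One possible workaround, sketched in plain Python and not tested with Pyomo or ipopt: replace the branches with smooth tanh switches so the whole relation becomes a single differentiable expression. The names `step` and `star_lum_smooth` and the steepness `k` are made up. It is an assumption that the same expression could be written inside a Pyomo model using its own tanh function.

```python
import math

def step(m, m0, k=200.0):
    """Smooth 0 -> 1 switch centred at m0; larger k gives a sharper transition."""
    return 0.5 * (1.0 + math.tanh(0.5 * k * (m - m0)))

def star_lum_smooth(mass):
    # Each branch of the original if/elif/else is weighted by a smooth window.
    s1, s2, s3 = step(mass, 0.43), step(mass, 2.0), step(mass, 20.0)
    return (
        (1.0 - s1) * 0.23 * mass**2.3
        + (s1 - s2) * mass**4
        + (s2 - s3) * 1.5 * mass**3.5
        + s3 * 32000 * mass
    )
```

Away from the breakpoints this matches the original function closely. Near them it blends the two neighbouring branches, and since the branches do not meet exactly at the breakpoints, the blend smooths over a jump there.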

hasty mountain Apr 6, 2023, 4:37 AM

#

Guys, I'm beginning to write some research papers on AI and Deep Learning, and I wanted to know...can someone recommend me an app or another way to make sketches and schemes to illustrate concepts?
Making those at Paint 3D feels a bit too amateur...even for an amateur yert

untold cliff Apr 6, 2023, 4:43 AM

#

queen cradle Another case turns up when doing surveys. For example, imagine that you're doing...

Ah yeah this is the missing not at random case right? Where we have to figure out the reason for the missingness in order to see if we can do imputation?

iron basalt Apr 6, 2023, 5:03 AM

#

hasty mountain Guys, I'm beginning to write some research papers on AI and Deep Learning, and I...

Make a plot with something like GNU Plot, export to SVG, and import into Inkscape.

hasty mountain Apr 6, 2023, 5:10 AM

#

iron basalt Make a plot with something like GNU Plot, export to SVG, and import into Inkscap...

Ugh...seems a bit complicated, but thanks!

#

At least Inkscape seems better than Paint 3D

arctic crown Apr 6, 2023, 5:14 AM

#

In a Neural Network why do we use multiple nodes? if the nodes have the same activation function isn't it useless to connect every input to every node because we would get the same value for that input from every node? is each node calculating something different?

hasty mountain Apr 6, 2023, 5:15 AM

#

arctic crown In a Neural Network why do we use multiple nodes? if the nodes have the same act...

Same activation function doesn't mean same output

#

If your first node is applying the operation 1x5, and your activation function is a ReLU, its output will be different from your second node that is applying the operation 5x-1 with ReLU function.

#

ReLU(1x5) = 5
ReLU(5 x -1) = 0
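
#

The same arithmetic as a small sketch, assuming each node is just "weight times input, then ReLU" (the `node` helper and the optional `bias` are illustrative names, not from any library):

```python
def relu(x):
    # Rectified linear unit: negative values are clipped to zero.
    return max(0.0, x)


def node(weight, x, bias=0.0):
    # One neuron: weighted input plus bias, passed through the activation.
    return relu(weight * x + bias)


x = 5.0
out_node_1 = node(1.0, x)    # ReLU(1 * 5)
out_node_2 = node(-1.0, x)   # ReLU(-1 * 5)
```

Both nodes share the activation function and the input; only the weight differs, and that is enough to give different outputs.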

arctic crown Apr 6, 2023, 5:28 AM

#

How come the function changed?

#

From 1x5 to 5x-1

stone marlin Apr 6, 2023, 5:41 AM

#

Howdy, y'all. I work primarily in the ds/ml/mle space right now and have Python as a daily driver. I've got some free time comin' up and was checking out some things I could learn. One of them is GoLang. So, here's a very soft question related to that.

Q: Does anyone have experience working with Go and Python together in the ds/ml/mle space? Are there common ds or data engineering problems that you feel Go is particularly good at either by itself or coupled with Python?

hasty mountain Apr 6, 2023, 5:52 AM

#

arctic crown How come the function changed?

The Rectified Linear function is defined as:
f(x) = 0 if x <= 0 ; else f(x) = x

#

And the first node would execute the operation 1x5, while the second one, 5 x -1

#

So, you'd have:

1 x 5 (node 1) = 5 ----> ReLU(5) = 5 ---> 5 x -1 (node 2) ---> ReLU(-5) = 0

grand canyon Apr 6, 2023, 5:58 AM

#

i had a question

#

#

this is a graph of training loss vs epochs, as well as validation accuracy vs epochs

#

why does loss fluctuate so much

#

even though it gets lower over time and accuracy increases over time

stone marlin Apr 6, 2023, 6:04 AM

#

It's difficult to tell why this would be the case just from looking at the graphs. The most common thing in my (limited) experience was a batch size which is too small or not representative of the population. tl;dr, maybe try increasing batch_size and see what happens.
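
#

A rough illustration of the batch-size effect, assuming the per-sample losses are fixed and only the random choice of mini-batch varies. The synthetic exponential losses and the helper name `batch_mean_spread` are made up for the sketch; a real training run also changes the losses as the weights update.

```python
import random
import statistics


def batch_mean_spread(per_sample_losses, batch_size, n_batches=2000, seed=0):
    """Standard deviation of the mean loss across randomly drawn mini-batches."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(per_sample_losses, batch_size))
             for _ in range(n_batches)]
    return statistics.stdev(means)


rng = random.Random(42)
losses = [rng.expovariate(1.0) for _ in range(10_000)]  # synthetic per-sample losses
spread_small = batch_mean_spread(losses, batch_size=8)
spread_large = batch_mean_spread(losses, batch_size=128)
```

The spread of the batch mean shrinks roughly like one over the square root of the batch size, which is why a larger `batch_size` tends to give a smoother loss curve.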

grand canyon Apr 6, 2023, 6:06 AM

#

ill try doing that

grand canyon Apr 6, 2023, 6:12 AM

#

stone marlin It's difficult to tell why this would be the case just from looking at the graph...

i think i get it

#

i had a 100000 data set

#

i made a mini sample

#

of 1,000 images

#

and ran the model on it

#

resulting in greater repetitiveness and therefore more fluctuation in training loss

#

if i trained the model on the cloud using the entire dset i think i would have less volatility

#

im not sure if that's the right thought process

#

that's just what i think is the case

stone marlin Apr 6, 2023, 6:15 AM

#

I have not worked with NNs enough to know the ins-and-outs, and I do not usually work with image data. It's possible that the batches were all just very different from each other or something, I'm unsure.

wooden sail Apr 6, 2023, 6:22 AM

#

queen cradle In the most general setting, this is really a property of things that you can ta...

it's specifically a property of sinusoids though. you can make a signal that does not have zero mean and you'll notice if it is entirely above the x axis, the envelope is the signal itself. then it's no longer symmetric. it has the nice symmetry property particularly due to the fourier modulating property, if you wanna think about it that way. as you wrote as well, it's due to the exponential having modulus 1 everywhere

hasty mountain Apr 6, 2023, 7:56 AM

#

Guys, when it's preferable to use KL-Divergence rather than MSE Loss?
I don't really get the difference between them. They both look like loss functions that penalize outputs too different from the target

#

I'm considering making a model to convert mel-spectrograms into waveforms. For that, I suppose I'd have to use one of those losses...

#

Uh...now that I think about it...I may have to use none of them at all, but actually something like gaussian log likelihood yert

Still, the question remains. When should I use KLD, when MSE?

wooden sail Apr 6, 2023, 8:16 AM

#

hasty mountain Guys, when it's preferable to use KL-Divergence rather than MSE Loss? I don't re...

all cost functions do this

#

KLD measures the difference between two distributions. MSE measures the distance between estimates/data points (that were drawn from some distribution)

#

under special conditions, the two things are the same. in general, they aren't

hasty mountain Apr 6, 2023, 8:16 AM

#

Oooh, between distributions

#

So MSE would be more like element-wise operations? Like...trying to recompose an image, for example.
And KLD for probability distributions?

hasty mountain Apr 6, 2023, 8:17 AM

#

wooden sail KLD measures the difference between two distributions. MSE measures the distance...

I see

#

Now it makes more sense

wooden sail Apr 6, 2023, 8:18 AM

#

as i said, these are sometimes the same thing. it depends on how the probability of observing a specific sample depends on that sample

hasty mountain Apr 6, 2023, 8:18 AM

#

But then...gaussian likelihood loss is for probability distributions too, isn't it?

#

A probability distribution?

wooden sail Apr 6, 2023, 8:18 AM

#

so, a gaussian pdf with fixed covariance is precisely a case where the two things are the same 😛

#

the distance between two gaussian distributions boils down to something proportional to the squared distance between their means

#

so the KLD and MSE both yield something that looks like least squares
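
#

A small numerical check of that special case, as a sketch in one dimension with a shared standard deviation: the KL divergence between two Gaussians reduces to the squared difference of the means divided by twice the variance. The function names and the integration grid are illustrative choices.

```python
import math


def kl_gauss_same_sigma(mu1, mu2, sigma):
    """Closed-form KL( N(mu1, sigma^2) || N(mu2, sigma^2) )."""
    return (mu1 - mu2) ** 2 / (2.0 * sigma ** 2)


def kl_numeric(mu1, mu2, sigma, lo=-30.0, hi=30.0, n=200_000):
    """Midpoint-rule approximation of the integral of p * (log p - log q)."""
    def log_pdf(x, mu):
        return (-0.5 * ((x - mu) / sigma) ** 2
                - math.log(sigma * math.sqrt(2 * math.pi)))

    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        lp, lq = log_pdf(x, mu1), log_pdf(x, mu2)
        total += math.exp(lp) * (lp - lq) * dx
    return total


closed = kl_gauss_same_sigma(1.0, 3.0, 1.5)
numeric = kl_numeric(1.0, 3.0, 1.5)
```

With the variance held fixed, minimizing this KL over the mean is the same optimization problem as minimizing a squared error, which is the "looks like least squares" point above.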

hasty mountain Apr 6, 2023, 8:22 AM

#

wooden sail the distance between two gaussian distributions boils down to something proporti...

Ok, so... the distance between two gaussian distributions would be...gaussian likelihood loss.
and it could also be measured by KLD.

If I sample a point from those two gaussian distributions...MSE Loss?

#

pithink

wooden sail Apr 6, 2023, 8:23 AM

#

hasty mountain Ok, so... the distance between two gaussian distributions would be...gaussian li...

gaussian likelihood loss is a maximum likelihood estimation, it's not the distance between two gaussian distributions

#

it just happens to look identical

hasty mountain Apr 6, 2023, 8:23 AM

#

pithink

wooden sail Apr 6, 2023, 8:24 AM

#

gaussian likelihood looks at how probable the observed samples are, as a function of the parameters of one distribution, given that the samples were drawn from that same distribution

#

KLD needs samples from 2 distributions

#

the final expression looks identical, but the interpretation is different

#

so that maximizing the likelihood given some data is the same as minimizing the KLD between two distributions

#

(only for this special case)

hasty mountain Apr 6, 2023, 8:28 AM

#

So...if I want my model to receive a mel-spectrogram as input, and generate a waveform with the most likely data values...which loss should I use?

#

I'm thinking about a Variational AutoEncoder, where the Decoder generates an image based on the most likely value for each pixel.
But the Decoder uses the gaussian likelihood loss.

#

Oooh, I think I'm getting it now.
I should probably make the model generate a probability distribution for each data point, sample from this distribution, and then apply a negative log likelihood, since the point sampled should be one with highest probability, right?

#

And to calibrate how my model will generate this probability distribution, I should probably generate a probability distribution with the waveform of the original audio data, and apply KLD to compare the two distributions...I guess

#

pithink

untold cliff Apr 6, 2023, 10:54 AM

#

Do you usually check for the p-value when checking for correlation ?

mint palm Apr 6, 2023, 11:53 AM

#

indepth tutorial of MIL? all i find is high level idea, or research paper

rough lava Apr 6, 2023, 12:08 PM

#

Hey Hey people Seth Here
I am trying some text binary classification with LSTM/GRU + GloVe
And I am wondering the following :

Should I use dropout? If so how much (Read that it generally dampens the memory of LSTM)
Probably underfitting since testing accuracy is either dropping or never rising, while training is almost always too high (e.g. 80% tr- 60% test). Is there a way to know if it is solely due to small dataset size ? (1700 sentences around 300-400 words)

mild dirge Apr 6, 2023, 12:32 PM

#

Better training performance and worse test performance often suggests that you are overfitting instead of underfitting.

#

And this could indeed (partially) be caused by a small dataset

#

But also a too complex model

rough lava Apr 6, 2023, 12:37 PM

#

mild dirge Better training performance and worse test performance often suggests that you a...

Thanks for answering !
Do you have any idea about the dropout ?
I havent found a consensus on that yet for lstm
Some suggest using it, some not in lstm specifically, so im going with trial and error for the time being
I guess text classification is more prone to overfitting when starting ...? lemon_thinking

mild dirge Apr 6, 2023, 12:39 PM

#

I haven't really used lstm's specifically, but it seems that some say that it does make the model forget things that might be important, thus reducing performance. I would honestly just try it out, and see what the effect is.

#

Can dropout not be added in a different part of the model?

rough lava Apr 6, 2023, 12:49 PM

#

mild dirge I haven't really used lstm's specifically, but it seems that some say that it do...

yeah thats what ive read so far
and ive been experimenting with and without to check the acc,precision,recall
But honestly I dont think I should trust results on small datasets

rough lava Apr 6, 2023, 12:51 PM

#

mild dirge Can dropout not be added in a different part of the model?

Isnt it only useful before making the predictions(for the train part) ?

queen cradle Apr 6, 2023, 1:12 PM

#

untold cliff Ah yeah this is the missing not at random case right? Where we have to figure ou...

That's right. And unless you have a pretty clear picture of why your data is missing, it's possible that there's some non-random reason why it went missing.

arctic crown Apr 6, 2023, 1:17 PM

#

hasty mountain So, you'd have: 1 x 5 (node 1) = 5 ----> ReLU(5) = 5 ---> 5 x -1 (node 2) ---> ...

Wait so every node would be different? But who decides what function to put in the nodes?

glacial talon Apr 6, 2023, 1:33 PM

#

Hello everyone could you pls suggest a best use case in educational sector to solve a problem using chatgpt

queen cradle Apr 6, 2023, 1:34 PM

#

wooden sail it's specifically a property of sinusoids though. you can make a signal that doe...

No, it applies more broadly than just to sinusoids. But while the thing I described is some sort of envelope, it may not reflect our intuitive sense of what an envelope ought to be. Plus, the conditions of the theorem that I'm invoking are a little delicate; for example, if your signal has a non-zero mean, then it doesn't decay fast enough for the theorem to apply. In fact the theorem doesn't apply to pure tones. For example, the real part of f(z) = ie^{z^2} is zero on the real axis, while the imaginary part blows up; clearly you can add this f(z) to a pure tone, so there's more than one holomorphic function which has a real part on the real axis which equals the pure tone; that is, without a decay condition, the analytic signal is not unique. Any time you use a Hilbert transform to construct an analytic signal, you're either making some kind of assumption that things behave nicely, or you're saying that the only analytic signals you care about are the ones where the imaginary part is determined by the Hilbert transform (an assumption that's fine in practice).

wooden sail Apr 6, 2023, 1:41 PM

#

well yeah, i was talking only about the envelope as a curve modulating a sinusoid, which is the common definition. indeed the hilbert transform is a more general relationship between the real and imaginary parts of analytic functions on the upper half complex plane. just that you don't get the nice symmetry in general

rough lava Apr 6, 2023, 1:47 PM

#

Hardest part about having a bilingual education is half the terms come to mind in english and half in my language
Which at times can be confusing af

queen cradle Apr 6, 2023, 1:49 PM

#

I guess my point is that the Hilbert transform leads to a reasonable definition of an envelope in many cases of interest, and it can be used to give a definition of an envelope that applies more widely, though perhaps with some loss of intuition and application.

wooden sail Apr 6, 2023, 1:51 PM

#

absolutely. i was just addressing the question they asked about the upper and lower envelope being the same, not in general about analytic representations

#

and regarding the non-zero mean, i was handwavy there. i meant non-zero mean on the support of a function, and zero (not just the mean) everywhere else

#

then you get all the nice properties of square integrability to do the analysis in fourier domain

versed gulch Apr 6, 2023, 2:06 PM

#

How does data augmentation work in Pytorch where the model sees a new dataset for each epoch?

in my current workflow here is my code in general:

# my custom dataset class
class Eye(Dataset):
....
my_transforms = transforms.Compose([
  transforms.ToPILImage(),
  transforms.RandomRotation(degrees = 45),
  transforms.RandomHorizontalFlip(p = 0.5),
  transforms.RandomVerticalFlip(p = 0.5),
  transforms.ToTensor(), # default normalization to 0-1 range
  transforms.Normalize(mean = (0.5, ), std = (0.5, )) # greyscale images
])
# loading training dataset, val and loaders
train_dataset = Eye(image_dir = "...", 
    mask_dir = "...", 
   transform=my_transforms)

train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=BS,
    shuffle=True,
    num_workers=2
)

train_losses, val_losses = [], []

for epoch in range(EPOCHS + 1):

    train_loss = train(model, train_loader, optimizer, loss_fn, device)
    val_loss = evaluate(model, val_loader, loss_fn, device)

mild dirge Apr 6, 2023, 2:11 PM

#

versed gulch How does data augmentation work in Pytorch where the model sees a new dataset fo...

What's the question?

#

The transform is applied to each sample every time it is loaded into a batch

#

No because the data is not stored. The data loader keeps loading in batches of images and then applying the transformation.

#

So each epoch will not have the same augmented images

versed gulch Apr 6, 2023, 2:14 PM

#

mild dirge So each epoch will not have the same augmented images

oh so this is done automatically for each epoch?

#

just confused that the transforms argument on the dataset class is used before the epoch loop

mild dirge Apr 6, 2023, 2:16 PM

#

Well it's just a function that gets called on each sample whenever the dataloader loads a batch
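
#

A dependency-free sketch of why that works, assuming the dataset applies its transform inside `__getitem__` (the usual pattern for a map-style dataset). `ToyDataset` and `random_flip` are stand-ins invented for the example; they only mimic the protocol and are not PyTorch classes.

```python
import random


class ToyDataset:
    """Mimics the __len__/__getitem__ protocol of a map-style dataset."""

    def __init__(self, items, transform=None):
        self.items = items
        self.transform = transform   # stored once, before any epoch runs

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        item = self.items[idx]
        # The transform runs here, on every access, so nothing augmented is cached.
        return self.transform(item) if self.transform else item


def random_flip(seq, p=0.5):
    # Stand-in for a random horizontal flip: reverse the sequence with probability p.
    return seq[::-1] if random.random() < p else seq


random.seed(0)
data = ToyDataset([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
                  transform=random_flip)
# Reading the dataset five times plays the role of five epochs.
epochs = [[data[i] for i in range(len(data))] for _ in range(5)]
```

Passing the transform to the constructor only stores the function; the random draw happens at access time, so each pass over the data can see a differently augmented version of the same stored images.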

sleek harbor Apr 6, 2023, 2:43 PM

#

Does sklearn.model_selection.cross_val_score shuffle before splitting? I thought it did, but.. just read the documentation, and it seems it doesn't. Why not? Wouldn't it make more sense if it did by default?

dense crane Apr 6, 2023, 3:05 PM

#

how hard would it be to make a chess bot trained on some user who played around 20k blitz games and make it play like him (same openings, as close as possible to his style of managing time during a game, and so on)

#

i am a little bit afraid that 20k is not enough data and after the opening it will just start to play random moves

rough lava Apr 6, 2023, 3:07 PM

#

sleek harbor Does `sklearn.model_selection.cross_val_score` shuffle before splitting? I thoug...

this splits and shuffles
xtrain, xtest, ytrain, ytest = train_test_split(df['feature_1'], df['feature_2'], shuffle=True, test_size=0.2)

#

you can look it up as well https://scikit-learn.org/0.20/modules/generated/sklearn.model_selection.train_test_split.html

sleek harbor Apr 6, 2023, 3:24 PM

#

rough lava this splits and shuffles xtrain, xtest, ytrain, ytest = train_test_split(df['fea...

Yeah, I know, but that doesn't perform cross validation.. I know how to cross validate with shuffling, that's not what I'm asking. I'm asking why cross_val_score doesn't shuffle by default.. is there a reason for that? Wouldn't it make more sense if it did shuffle?

mild dirge Apr 6, 2023, 3:32 PM

#

To make the results reproducible ig

rough lava Apr 6, 2023, 3:43 PM

#

sleek harbor Yeah, I know, but that doesn't perform cross validation.. I know how to cross va...

Does it not have split in here? lemon_thinking
cross_val_score(estimator=model, X=X, y=y, scoring='r2', cv=KFold(shuffle=True))
Documentation seems to indicate it is just false at start
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html
Or am I understanding the wrong thing again ? pithink

#

I am really curious now
Also what type of model are you going for?

broken vortex Apr 6, 2023, 3:57 PM

#

Hi guys, for datascience class I have to create a KNN algorithm with good parameters so I made a simple implementation and I get on average 97.5% accuracy.
Is there a way to improve a simple implementation of KNN ?

#

Implementation with no libraries

cold osprey Apr 6, 2023, 4:03 PM

#

Hi, just curious if anyone uses some sort of experiment tracking with their personal projects?

#

e.g. mlflow, dvc

sleek harbor Apr 6, 2023, 4:09 PM

#

rough lava Does it not have split in here? lemon_thinking cross_val_...

Yeah, that's correct. What I'm wondering is: why is it false by default? Shuffle is true by default in train_test_split, so why isn't it true by default for cross_validation? I just want to know the reasoning behind that choice.
If I didn't notice that it's false by default, I probably would've gone using CV, assuming that it shuffles by default (because.. I think that makes more sense), which could lead to problems, for polynomial models, for example
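
#

A plain-Python sketch of the difference, assuming the unshuffled default behaves like contiguous blocks of row indices (which is my understanding of `KFold` with `shuffle=False`; for classifiers with an integer `cv`, scikit-learn uses a stratified splitter instead). `kfold_indices` is an illustrative helper, not the library function.

```python
import random


def kfold_indices(n_samples, n_splits=5, shuffle=False, seed=None):
    """Yield (train_idx, test_idx) pairs; contiguous folds unless shuffle=True."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
                  for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


plain = [test for _, test in kfold_indices(10, n_splits=5)]
mixed = [test for _, test in kfold_indices(10, n_splits=5, shuffle=True, seed=0)]
```

If the rows are sorted (by time, by target value, by class), the contiguous folds in `plain` each hold out one slice of that ordering, which is where the trouble mentioned above comes from. With scikit-learn itself, passing `cv=KFold(shuffle=True, random_state=...)` as in the earlier message is the explicit way to opt in.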

untold cliff Apr 6, 2023, 4:15 PM

#

untold cliff Do you usually check for the p-value when checking for correlation ?

@queen cradle can you answer this question please? Because if we should do that then what would a good visualization of it be cause heatmaps wouldn't be appropriate i guess ?

mint palm Apr 6, 2023, 4:56 PM

#

indepth tutorial of Multiple instance learning(MIL)? all i find is high level idea, or research paper

hasty mountain Apr 6, 2023, 5:06 PM

#

arctic crown Wait so every node would be different? But who decides what function to put in th...

The activation function for each node is kind of a hyperparameter, but the number in the node that will multiply the input is simply the weight of that node. And this number is defined upon initialization and optimized through training

hasty mountain Apr 6, 2023, 5:15 PM

#

wooden sail the distance between two gaussian distributions boils down to something proporti...

Oh, and I just discovered that I was using a MSE Loss wrongly for my self-learning model.
The model outputs a probability distribution out of a softmax...and for measuring its consistency, I was applying MSE between 2 outputs of the same model(the model has dropout, so those outputs tend to be slightly different)

#

It seems KL-Divergence would be preferred, here? Or maybe cross entropy, since it's a classification task... pithink
I think I'll just test both

#

Though Pytorch's KLD seems a bit weird. I tend to get some problems with it...

wooden sail Apr 6, 2023, 5:20 PM

#

i think someone here mentioned that pytorch's KLD applies softmax to the input you pass to it to guarantee it behaves like a pdf

hasty mountain Apr 6, 2023, 5:22 PM

#

wooden sail i think someone here mentioned that pytorch's KLD applies softmax to the input y...

#

Oh, it's pointwise

#

pithink

#

Does it change anything?

#

The KL-Divergence I used for my Variational AutoEncoder seems closer to what I want:

def kl_divergence(z, mu, std):
    # Monte carlo KL divergence
    # 1. define the first two probabilities (in this case Normal for both)
    p = torch.distributions.Normal(torch.zeros_like(mu), torch.ones_like(std))
    q = torch.distributions.Normal(mu, std)

    # 2. get the probabilities from the equation
    log_qzx = q.log_prob(z)
    log_pz = p.log_prob(z)

    # kl
    kl = (log_qzx - log_pz)
    kl = kl.sum(-1)
    return kl
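
#

For a sanity check on a Monte Carlo KL like the one above, here is a sketch in plain Python (no torch) comparing the sampled estimate with the known closed form for a one-dimensional Gaussian against a standard normal prior. The helper names, sample count, and the test values of `mu` and `std` are illustrative assumptions.

```python
import math
import random


def kl_to_standard_normal(mu, std):
    """Closed-form KL( N(mu, std^2) || N(0, 1) ) for a single dimension."""
    return -math.log(std) + 0.5 * (std ** 2 + mu ** 2) - 0.5


def kl_monte_carlo(mu, std, n=200_000, seed=0):
    """Average of log q(z) - log p(z) over samples z drawn from q."""
    rng = random.Random(seed)

    def log_normal(z, m, s):
        return -0.5 * ((z - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)

    total = 0.0
    for _ in range(n):
        z = rng.gauss(mu, std)
        total += log_normal(z, mu, std) - log_normal(z, 0.0, 1.0)
    return total / n


exact = kl_to_standard_normal(0.7, 1.3)
estimate = kl_monte_carlo(0.7, 1.3)
```

No extra factor is needed in front of `log_qzx - log_pz` when `z` is sampled from `q`: drawing from `q` is what supplies the weighting by `q(z)`. A single sample per element, as in the function above, is a noisy but unbiased version of this average.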

#

I remember that Pytorch's version didn't work well...but that could be just me doing something wrong...

wooden sail Apr 6, 2023, 5:24 PM

#

that looks like the same as pytorch does

#

ah you don't have the product in front though

wooden sail Apr 6, 2023, 5:25 PM

#

hasty mountain

the pointwise part is that it does not add/integrate over all observable events

hasty mountain Apr 6, 2023, 5:26 PM

#

Oh...

#

And that product...is it mandatory in the function?
Should I apply kl = log_pz * (log_qzx - log_pz)?

hasty mountain Apr 6, 2023, 5:27 PM

#

wooden sail that looks like the same as pytorch does

Indeed...but still, I remember having some problems with it

#

Well, I guess I'll just go for it. If it messes up, I'll try the VAE version.

misty flint Apr 6, 2023, 6:22 PM

#

interesting syllabus

#

https://stanford-cs324.github.io/winter2022/

CS324

Home

Understanding and developing large language models.

hasty mountain Apr 6, 2023, 6:47 PM

#

hasty mountain Well, I guess I'll just go for it. If it messes up, I'll try the VAE version.

grumpchib

sterile wyvern Apr 6, 2023, 6:57 PM

#

Bayesian clustering, is there such a thing?

hasty mountain Apr 6, 2023, 7:29 PM

#

hasty mountain The KL-Divergence I used for my Variational AutoEncoder seems closer to what I w...

Hm... Since I'm doing a classification task, using torch.distributions.Categorical seems to make more sense than using a Normal distribution.
Problem is... Pytorch doesn't have a rsample implemented for Categorical distribution.
Should my implementation be something like:

output = model(input)
output = torch.nn.functional.softmax(output, -1)
distribution = torch.distributions.Categorical(output)

rsample = output[:, -1] * distribution.sample()

?

#

The model output is supposed to be a probability distribution. And since it can be derived, then I suppose multiplying the distribution sample by the output argmax would allow for optimization, right?
Though it feels a bit redundant...I make a distribution out of my output, and then sample from this distribution using the same output... pithink

wooden sail Apr 6, 2023, 7:34 PM

#

you shouldn't need to use torch distributions categorical there

#

applying a softmax at the last layer of your network already turns the output into probabilities

hasty mountain Apr 6, 2023, 7:37 PM

#

So, the softmax already turns it into a distribution?
So...the KLD would be KLD = outputA - outputB directly?

wooden sail Apr 6, 2023, 7:37 PM

#

that depends on whether the kld will apply softmax again tbh, i don't recall if it does

wooden sail Apr 6, 2023, 7:38 PM

#

hasty mountain So, the softmax already turns it into a distribution? So...the KLD would be `KLD...

some logs are missing there
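
#

A minimal sketch of the discrete version with the logs in place, assuming both outputs are already softmax probabilities. The `eps` clamp is an illustrative guard against `log(0)`, and the logits are made-up numbers. As far as I know, PyTorch's `kl_div` expects log-probabilities for its first argument, so passing raw probabilities or raw logits there is one common way to end up with strange values.

```python
import math


def softmax(logits):
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_discrete(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * (log p_i - log q_i) for two probability vectors."""
    return sum(pi * (math.log(max(pi, eps)) - math.log(max(qi, eps)))
               for pi, qi in zip(p, q))


out_a = softmax([2.0, 0.5, -1.0])   # e.g. one forward pass with dropout
out_b = softmax([1.5, 0.7, -0.8])   # e.g. a second pass on the same input
kl_ab = kl_discrete(out_a, out_b)
```

Because softmax outputs are strictly positive, the logs are well defined without any ReLU or absolute value on the output layer; the clamp only matters if a probability underflows to zero.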

hasty mountain Apr 6, 2023, 7:38 PM

#

That's the thing, I want to use a custom way to apply KLD, since Pytorch seems to be returning NaN

wooden sail Apr 6, 2023, 7:38 PM

#

that's a good indicator you did something wrong 😛

hasty mountain Apr 6, 2023, 7:38 PM

#

yert

#

Oh...

dist = torch.distributions.Categorical(out)
print(out[0])
print(dist.probs[0])

tensor([0.2728, 0.0915, 0.0055, 0.0100, 0.0943, 0.1920, 0.0499, 0.0216, 0.0858,
        0.0016, 0.0006, 0.1745], device='cuda:0')
tensor([0.2728, 0.0915, 0.0055, 0.0100, 0.0943, 0.1920, 0.0499, 0.0216, 0.0858,
        0.0016, 0.0006, 0.1745], device='cuda:0')

#

~~So there's no magic?~~

hasty mountain Apr 6, 2023, 8:13 PM

#

wooden sail that's a good indicator you did something wrong 😛

Oh yes...log can't take a negative value... yert

#

So I should apply a ReLU to my output layer...I guess...or at least a modulus

stone marlin Apr 6, 2023, 8:23 PM

#

How do y'all productionize your models (at work or for personal projects)?

I've been in the habit of sticking the model in a docker container with a small API that has a "predict" endpoint. This has some limitations but works "okay" for models which aren't getting passed a ton of data and/or which aren't near-real-time.

(This is also nice because if we want to A/B test models, we can split the data beforehand and send it to different containers!)
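
#

A bare-bones sketch of that kind of "predict" endpoint using only the Python standard library, assuming JSON in and JSON out. The linear `predict` function stands in for a real model, and the route name, port, and payload shape are all hypothetical; a production service would normally use a proper web framework and load a trained model at startup.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Hypothetical stand-in for a real model: a fixed linear score.
    weights = [0.4, -0.2, 0.1]
    return sum(w * x for w, x in zip(weights, features))


def handle_predict(body: bytes):
    """Turn a JSON request body into (status_code, JSON response bytes)."""
    try:
        features = json.loads(body)["features"]
        result = {"prediction": predict(features)}
        return 200, json.dumps(result).encode()
    except (ValueError, KeyError, TypeError) as exc:
        return 400, json.dumps({"error": str(exc)}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    # Intended as the container's entry point; nothing is started on import.
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Keeping the request handling in a plain function (`handle_predict`) separate from the HTTP plumbing makes it easy to unit test, and running two containers with different `predict` implementations gives the A/B setup described above.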

raw compass Apr 6, 2023, 9:08 PM

#

this book: "Make Your Own Neural Network-2016", is still good?

cold osprey Apr 6, 2023, 9:54 PM

#

stone marlin How do y'all productionize your models (at work or for personal projects)? I'...

i dont work on ML at work but this is how i intend to do it

#

for personal projects hosted on vercel, may just include the pickle version of model and directly use it within the app

#

not tested this tho but its what im currently working on

stone marlin Apr 6, 2023, 9:56 PM

#

That sounds pretty similar to what I'm doin' and seems pretty okay. Never used vercel but it looks like it would work fine like that!

spare plover Apr 7, 2023, 12:01 AM

#

Hey hello guys I need help writing the code for detecting the iris and pupil using Daugman's algorithm and converting that iris from polar or circular form into cartesian form..... if any ans pls dm me....

misty flint Apr 7, 2023, 12:09 AM

#

stone marlin How do y'all productionize your models (at work or for personal projects)? I'...

for one project we deployed it using aws lambda

#

their serverless service

#

which was fine for this use case since the cold start issue didnt matter as much

#

if you do this, i recommend using the cpu version of pytorch since it made a huuuge difference (aws lambda is cpu based)

#

not sure if youve seen this but i highly recommend the FSDL course https://fullstackdeeplearning.com/course/2022/lecture-5-deployment/

Full Stack Deep Learning - Lecture 5: Deployment

How to turn an ML model into an ML-powered product

cold osprey Apr 7, 2023, 12:20 AM

#

misty flint for one project we deployed it using aws lambda

this essentially makes it a (micro?)service?

#

that can be used by anything that has access to it

misty flint Apr 7, 2023, 12:22 AM

#

im trying to send 3 dif approaches but it keeps saying my files are queued

#

🕯️

#

tragic

#

CL5_FeelsBongoMan

#

broken rip

#

misty flint Apr 7, 2023, 12:30 AM

#

misty flint not sure if youve seen this but i highly recommend the FSDL course https://fulls...

from the lecture

misty flint Apr 7, 2023, 12:31 AM

#

misty flint

@cold osprey deploying it along with vercel in the web app server is the first approach i believe

#

the lambda approach would be the 3rd one

#

they call it model-as-a-service but its essentially a microservice

#

they each have their pros and cons. just be careful when scaling

#

(if applicable)

ocean swallow Apr 7, 2023, 12:36 AM

#

has there been any good models for production that can tag products with related labels?

#

I asked people at vue.ai and they asked me $40k dollars lol

#

Google Vision API is terrible too ...

cold osprey Apr 7, 2023, 12:43 AM

#

misty flint @cold osprey deploying it along with vercel in the web app server is th...

nice thanks, i think 1st one makes more sense for a simple app kinda use case where the model inference time is not too long/computationally expensive

#

2nd one more for when the predictions are to be displayed in a dashboard too hence batch processing

#

3rd one is kinda for more live stuff?

#

real time i mean

misty flint Apr 7, 2023, 12:44 AM

#

cold osprey 2nd one more for when the predictions are to be displayed in a dashboard too hen...

its good for recommendation use cases

ocean swallow Apr 7, 2023, 12:45 AM

#

#

I guess I shouldn't even bother doing this myself since Amazon is not able to LOL

misty flint Apr 7, 2023, 12:45 AM

#

where you can do intense computations like neural collaborative filtering, etc. in a batch process (since they would take too long for real time inference), run them daily or weekly or etc. then load them up

#

when you need them

misty flint Apr 7, 2023, 12:47 AM

#

cold osprey nice thanks, i think 1st one makes more sense for a simple app kinda use case wh...

but yeah this works for stuff like personal websites for sure

#

no probs

#

btw

cold osprey Apr 7, 2023, 12:49 AM

#

misty flint where you can do intense computations like neural collaborative filtering, etc. ...

i wonder if netflix's is batch or

queen cradle Apr 7, 2023, 12:50 AM

#

untold cliff @queen cradle can you answer this question please? Because if we should ...

Like so many things, the answer is "it depends". Usually, the null hypothesis of such a test is that the correlation coefficient is zero. If you reject the null hypothesis, then you've found some kind of correlation in the data. If your test was based on Pearson's 𝜌, then you found evidence that the true value of 𝜌 is non-zero, so the covariance is non-zero; if your test was based on Kendall's 𝜏, then you found evidence that the true value of 𝜏 is non-zero, so the variables have some kind of order relationship; and so on.

Rejecting the null hypothesis implies that the two variables are not independent. This may be your goal; it lets you go to your colleagues and say, "There's a relationship!" I have done this before: I was able to say, "These things you thought were probably correlated are provably correlated," and people liked that. But failing to reject the null hypothesis does not prove that the variables are independent. No test for correlation can prove independence. If you want to be confident that the rest of your analysis is sound, then you can't assume independence unless you have domain-specific knowledge that tells you the events ought to be independent. For that reason, I think hypothesis tests for correlation coefficients have limited utility. I think they're used more often than they should be.
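
#

To make the test concrete, here is a sketch of a permutation test for Pearson's correlation in plain Python. It is one of several ways to get a p-value (not necessarily what a statistics package does by default), and the synthetic data, sample size, and helper names are made up for the example.

```python
import math
import random


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value for H0: no association, obtained by shuffling ys."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    ys = list(ys)            # copy so the caller's data is not reordered
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ys)
        if abs(pearson_r(xs, ys)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(60)]
y_dep = [xi + rng.gauss(0, 0.5) for xi in x]   # strongly related to x
y_ind = [rng.gauss(0, 1) for _ in x]           # generated independently of x
p_dep = permutation_p_value(x, y_dep)
p_ind = permutation_p_value(x, y_ind)
```

A small `p_dep` is evidence of a relationship. Whatever `p_ind` turns out to be, a large p-value would only mean the test failed to detect a linear relationship, not that the variables are independent.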

misty flint Apr 7, 2023, 12:53 AM

#

cold osprey i wonder if netflix's is batch or

batch would be offline in here. netflix also uses nearline + online. this article is from 2013. im sure theyve moved towards more of a streaming first/online infrastructure since then https://netflixtechblog.com/system-architectures-for-personalization-and-recommendation-e081aa94b5d8

#

cold osprey Apr 7, 2023, 1:00 AM

#

nice

misty flint Apr 7, 2023, 1:01 AM

#

nod

fiery jungle Apr 7, 2023, 2:40 AM

#

hi ,
im trying to self learn AI, working on understanding how to make a line that eventually converges the loss curve.
i find it a lot easier to use something like this which is part of the course

I wonder if there is a software or a way to visualize our output better than the matplotlib

#

while matplotlib gives the final output for the whole process the output from that course was updated after each epoch, i feel more comfortable with that kind of interface if there's one, if not , its fine . but just let me know if u know some application/interface that can visualize the graph better

cold osprey Apr 7, 2023, 3:33 AM

#

fiery jungle while `matplotlib` gives the final output for the whole process the output from ...

U can track the loss/accuracy etc per epoch

#

Not sure if that's what you're asking

thorn swift Apr 7, 2023, 3:43 AM

#

Bro I’m trying to figure out networkx right now too

sterile wyvern Apr 7, 2023, 3:49 AM

#

Can Bayesian optimization pick the least correlated parameter sets from an insample test?

lapis sequoia Apr 7, 2023, 6:47 AM

#

Hi, I need help with building OpenCV from source. Why? I got this error while trying to do cv2.imshow("image", image) in a project that then recommended building from source.
The error:

I tried to install the required packages using sudo apt-get update && sudo apt-get install libpangoft2-1.0-0 libtiff5 but they're already in latest version.
Note: I'm using Ubuntu 22.04

dense gulch Apr 7, 2023, 6:49 AM

#

Hello guys.....need help in capitalizing duplicate letters in two given strings in python. For example in a string 'computer program' we have duplicates or repeated letters 'OMPR' ....I want the output as cOMPuteR PROgRaM
I am able to grab duplicates from the given string but couldn't understand how to capitalize them in the main string

untold cliff Apr 7, 2023, 10:24 AM

#

queen cradle Like so many things, the answer is "it depends". Usually, the null hypothesis of...

So you're implying that they should be used when we want to prove dependence? Or isn't it worth it either?

untold cliff Apr 7, 2023, 10:50 AM

#

dense gulch Hello guys.....need help in capitalizing duplicate letters in two given strings ...

There's probably a better approach but this should work: https://pastebin.com/iY72RVN3
Also, i dont think this is the correct channel for this question.
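
For reference, a minimal sketch of one way to do it, assuming "duplicate" means a letter that occurs more than once in the string, ignoring case. The function name `capitalize_repeated` is made up here and this is not the code behind the link.

```python
from collections import Counter


def capitalize_repeated(text):
    """Upper-case every letter that occurs more than once in text."""
    # Count letters case-insensitively; spaces and punctuation are ignored.
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    return "".join(ch.upper() if counts[ch.lower()] > 1 else ch for ch in text)


result = capitalize_repeated("computer program")
print(result)  # cOMPuteR PROgRaM
```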

dense gulch Apr 7, 2023, 11:48 AM

#

untold cliff There's probably a better approach but this should work: https://pastebin.com/iY...

Sorry..... Which is the correct channel for such questions on python?

#

And thanks for the solution

untold cliff Apr 7, 2023, 11:58 AM

#

dense gulch Sorry..... Which is the correct channel for such questions on python?

The help channel is most suitable i believe.

versed gulch Apr 7, 2023, 12:46 PM

#

does anyone know how to implement data augmentation techniques for 3D images in Pytorch?

class Eye(Dataset):
  def __init__(self, image_dir, mask_dir, transform = None):
    self.image_dir = image_dir
    self.mask_dir = mask_dir
    self.transform = transform
    self.images = sorted(os.listdir(image_dir))
    self.masks = sorted(os.listdir(mask_dir))
    self.images.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
    self.masks.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
    
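  def __len__(self):
    # Every image needs a matching mask; fail loudly instead of returning None.
    if len(self.images) != len(self.masks):
      raise ValueError("number of images and masks differ")
    return len(self.images)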
  
  def __getitem__(self, index):
    img_path = os.path.join(self.image_dir, self.images[index])
    mask_path = os.path.join(self.mask_dir, self.masks[index])
    image = io.imread(img_path)
    image = image.astype(np.float32)
    
    mask = io.imread(mask_path) 
    mask = mask.astype(np.float32)
    
    if self.transform:
      image = self.transform(image)
      mask = self.transform(mask)
    
    return image, mask
my_transforms = transforms.Compose([
  transforms.ToPILImage(),
  transforms.RandomRotation(degrees = 45),
  transforms.RandomHorizontalFlip(p = 0.5),
  transforms.RandomVerticalFlip(p = 0.5),
  transforms.ToTensor(),
  transforms.Normalize(mean = (0.5, ), std = (0.5, ))
])

train_dataset_DA = Eye(image_dir = image_path, mask_dir = mask_path, transform=my_transforms)

train_loader = ...

for x, y in train_loader:
  print(y.shape)

#

I'm getting this error for the for loop

#

my dataset consists of 3D greyscale images of shape 32x256x256 (Depth x Height x Width)

untold cliff Apr 7, 2023, 1:12 PM

#

versed gulch does anyone know how to implement data augmentation techniques for 3D images in ...

I don't know how to do that but based on the error message maybe you could try passing your image as 32 images each of shape 256*256 and then you combine them back together. Not sure if this would give you the desired result though.

tidal bough Apr 7, 2023, 1:18 PM

#

versed gulch does anyone know how to implement data augmentation techniques for 3D images in ...

PIL doesn't support it, so I guess you'd have to avoid using ToPILImage somehow.

versed gulch Apr 7, 2023, 1:19 PM

#

tidal bough PIL doesn't support it, so I guess you'd have to avoid using ToPILImage somehow.

tried this but get this error message instead

#

it doesnt work with numpy arrays

tidal bough Apr 7, 2023, 1:20 PM

#

Yeah, it won't work directly with the image transform functions then

#

actually.. it's not clear to me what function fails here

versed gulch Apr 7, 2023, 1:22 PM

#

tidal bough actually.. it's not clear to me what function fails here

I think its the next_data function

tidal bough Apr 7, 2023, 1:22 PM

#

Try dropping ToTensor too, perhaps.

versed gulch Apr 7, 2023, 1:23 PM

#

I get the dimensions but same error

tidal bough Apr 7, 2023, 1:23 PM

#

because the thing is, the three image transforms you're using claim to work fine on tensors as well

versed gulch Apr 7, 2023, 1:24 PM

#

it now works for some reason when I put ToTensor() first

tidal bough Apr 7, 2023, 1:25 PM

#

ah, that makes sense, if these weren't tensors already then the image transforms might not work (the docs say they are for Pillow images or torch tensors)

versed gulch Apr 7, 2023, 1:25 PM

#

my_transforms = transforms.Compose([
  transforms.ToTensor(), # default normalization to 0-1 range
  transforms.RandomRotation(degrees = 45),
  transforms.RandomHorizontalFlip(p = 0.5),
  transforms.RandomVerticalFlip(p = 0.5),
  transforms.Normalize(mean = (0.5, ), std = (0.5, )) # greyscale images
])

#

so weird

queen cradle Apr 7, 2023, 1:35 PM

#

untold cliff So you're implying that they should be used when we want to prove dependance? Or...

If you want to prove dependence, then yes, these kinds of hypothesis tests are useful. As I said yesterday, I've used this kind of test for this purpose: There was an observation about our data that seemed intuitively plausible, and that people believed was true based on experience, but that people weren't sure how to prove statistically. We could have gotten along without proving it; but it made everyone more comfortable to be working with facts instead of feelings.

untold cliff Apr 7, 2023, 1:39 PM

#

queen cradle If you want to prove dependence, then yes, these kinds of hypothesis tests are u...

Thanks a lot. One last, somewhat unrelated, question. Do you know why we divide by the geometric mean in kendall's tau-b formula ?

fiery jungle Apr 7, 2023, 1:45 PM

#

cold osprey U can track the loss/accuracy etc per epoch

really ? how ?

queen cradle Apr 7, 2023, 1:47 PM

#

untold cliff Thanks a lot. One last, somewhat unrelated, question. Do you know why we divide ...

Off the top of my head, I don't remember. It's probably in Pratt and Gibbons, Concepts of Nonparametric Theory, but I don't have time to look it up just now.

untold cliff Apr 7, 2023, 1:49 PM

#

queen cradle Off the top of my head, I don't remember. It's probably in Pratt and Gibbons, _C...

Ok i'll have a look. Thanks a lot!

dusky cargo Apr 7, 2023, 1:50 PM

#

im trying to plot 6 things onto a figure using matplotlib

#Creating a matplotlib figure so all results displayed on one figure 
fig = plt.figure(figsize=(11,9))
fig.set_tight_layout(True)

print('Plotting Chart')
#Positive Collocation table
plt.subplot(2, 3, 1)
plot_table(positive_reviews_with_collocations_sorted[:40], f"Frequency of co-occuring words with POS tags of {bigram_postags} in Positive Reviews")
plt.subplot(2, 3, 2)
plot_table(positive_reviews_with_collocations_no_pos_sorted[:40], "Frequency of co-occuring words without POS tags in Positive Reviews")
#Negative Collocation table
plt.subplot(2, 3, 3)
plot_table(negative_reviews_with_collocations_sorted[:40], f"Frequency of co-occuring words with POS tags of {bigram_postags} in Negative Reviews")
plt.subplot(2, 3, 4)
plot_table(negative_reviews_with_collocations_no_pos_sorted[:40], "Frequency of co-occuring words without POS tags in Negative Reviews")

this has worked for me in the past but i havent used tables before, seems to not work now. is there another solution?

quaint loom Apr 7, 2023, 2:04 PM

#

Does anyone know how to solve this issue?

untold cliff Apr 7, 2023, 2:07 PM

#

quaint loom Does anyone know how to solve this issue?

You should use \ or / instead.

quaint loom Apr 7, 2023, 2:08 PM

#

untold cliff You should use \\ or / instead.

Thank you for this. I just copied from the properties description.

quaint loom Apr 7, 2023, 2:09 PM

#

untold cliff You should use \\ or / instead.

Instead of what you mean? I am using \

untold cliff Apr 7, 2023, 2:09 PM

#

quaint loom Thank you for this. I just copyed from the properties description.

I don't know why it was rendered that way but i meant 2 of this \ or 1 of this /

#

A single \ is used to escape characters like when you do "\n" for a newline. You should use 2 of it in order to escape it as well.
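
A minimal sketch of the options, assuming a made-up folder `C:\Users\me\data` and file name `prices.csv`. `PureWindowsPath` is used so the snippet behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# Three equivalent ways to write the same Windows path in Python source.
escaped = "C:\\Users\\me\\data\\prices.csv"   # each backslash doubled
raw = r"C:\Users\me\data\prices.csv"          # raw string: backslashes kept as-is
forward = "C:/Users/me/data/prices.csv"       # forward slashes are accepted on Windows

same_text = escaped == raw
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

# The Properties dialog shows only the folder, so the file name is joined on.
folder = PureWindowsPath(r"C:\Users\me\data")
full_path = folder / "prices.csv"
print(full_path)
```

Writing the same path with single backslashes in a normal (non-raw) string fails, because `\U` starts a Unicode escape.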

quaint loom Apr 7, 2023, 2:11 PM

#

untold cliff I dont why it was rendered that way but i meant 2 of this \ or 1 of this /

When using two \ I got PermissionError: [Errno 13]

untold cliff Apr 7, 2023, 2:12 PM

#

It means you dont have permission to access that file.

quaint loom Apr 7, 2023, 2:14 PM

#

ehm. I can open the excel file. Maybe change the location of the file?

untold cliff Apr 7, 2023, 2:14 PM

#

quaint loom ehm. I can open the excel file. Maybe change the location of the file?

I think if the file is already open in excel then you can't open it from somewhere else.

raw compass Apr 7, 2023, 2:15 PM

#

what is the best way to work with huge-data sets? I have a data-set(12gb), I think its not the best way to download that locally and upload that to github.

quaint loom Apr 7, 2023, 2:15 PM

#

untold cliff I think if the file is already open in excel then you can open it from somewhere...

At this moment, the excel file is not open but I can open it and have access to it.

quaint loom Apr 7, 2023, 2:18 PM

#

raw compass what is the best way to work with huge-data sets? I have a data-set(12gb), I thi...

When working with large datasets in Python, it's best to use libraries like Dask or Apache Spark, which allow you to process the data in a distributed manner across multiple machines
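
For a single file of this size on one machine, streaming it in pieces is often enough before reaching for a cluster. A minimal sketch using only the standard library follows. The column names and the `column_mean` helper are made up, and an in-memory `StringIO` stands in for the real file. pandas offers the same idea through the `chunksize` and `usecols` parameters of `read_csv`.

```python
import csv
import io


def column_mean(lines, column):
    """Stream rows one at a time and average one numeric column."""
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    for row in reader:
        # Only the current row is held in memory, never the whole file.
        total += float(row[column])
        count += 1
    return total / count if count else float("nan")


# Stand-in for a large file; with real data, pass an open file object instead.
sample = io.StringIO("year,price\n2020,10.0\n2021,14.0\n2022,12.0\n")
mean_price = column_mean(sample, "price")
print(mean_price)
```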

raw compass Apr 7, 2023, 2:18 PM

#

quaint loom When working with large datasets in Python, it's best to use libraries like Dask...

so like I can access to whole thing, in a server?

untold cliff Apr 7, 2023, 2:19 PM

#

quaint loom At this moment, the excel file is not open but I can open it and have access to ...

Are you sure that's the correct path. It doesnt have any extension at the end.

quaint loom Apr 7, 2023, 2:21 PM

#

untold cliff Are you sure that's the correct path. It doesnt have any extension at the end.

You`re saying something there. I am copying from this, but as you mention there is no extension at the end

untold cliff Apr 7, 2023, 2:22 PM

#

quaint loom You`re saying something there. I am copying from this, but as you mention there ...

Yeah apparently that's just the directory location. You add the filename at the end and use read_csv instead

quaint loom Apr 7, 2023, 2:24 PM

#

raw compass so like I can access to whole thing, in a server?

What should I put between the last part and the filename?

quaint loom Apr 7, 2023, 2:24 PM

#

raw compass so like I can access to whole thing, in a server?

Yes

untold cliff Apr 7, 2023, 2:25 PM

#

quaint loom What should I put between the last part and the filename?

2 of this \ like before

quaint loom Apr 7, 2023, 2:25 PM

#

data = pd.read_csv?

untold cliff Apr 7, 2023, 2:25 PM

#

Yeah

quaint loom Apr 7, 2023, 2:25 PM

#

untold cliff 2 of this \ like before

Thank you.

#

Appreciate your patience

#

@untold cliff You wouldn't know why it doesn't come in that other format in the properties?

untold cliff Apr 7, 2023, 2:28 PM

#

quaint loom @untold cliff You woulnd`t know why it doesnt come in that other format ...

Why it doesnt include the filename ?

quaint loom Apr 7, 2023, 2:29 PM

#

untold cliff Why it doesnt include the filename ?

Yes and double \

robust cliff Apr 7, 2023, 2:31 PM

#

guys how can I get rid of floating point imprecisions while calculating a polynomial's roots? I tried using the Decimal class from the decimal module but that didn't work

p = np.array([1, -5, 8, -4])
p_dec = [Decimal(str(coeff)) for coeff in p]
roots_dec = np.roots(p_dec)
print(roots_dec)

[2.00000006 1.99999994 1. ] <- it goes like this

untold cliff Apr 7, 2023, 2:32 PM

#

quaint loom Yes and double \\

No. I think the \ is windows specific though, dont really know why. And for not including the filename it might be somewhere in your settings to include the directory only

untold cliff Apr 7, 2023, 2:36 PM

#

robust cliff guys how can I get rid of floating point imprecisions while calculating a polyno...

It's due to the implementation i guess but you can round the results to a suitable precision and it would be fine

robust cliff Apr 7, 2023, 2:42 PM

#

alright thanks, I'll do that then

untold cliff Apr 7, 2023, 2:51 PM

#

robust cliff alright thanks, I'll do that then

If the 1.99... should be 2 then it wouldnt work. Maybe you should set a tolerance like 1e-5 and check if the difference between the results and their rounded values is less than that tolerance. And then plug the roots back to the polynomial just to check
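
A minimal sketch of that check, assuming the roots of interest are integers. The helper names `evaluate` and `snap_roots` are made up. A root is only snapped when it is within the tolerance of an integer and that integer really is a zero of the polynomial.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coefficients are highest degree first, as for np.roots."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


def snap_roots(coeffs, approx_roots, tol=1e-5):
    """Replace near-integer roots by the integer when it is an exact zero."""
    snapped = []
    for r in approx_roots:
        candidate = round(r)
        # Integer arithmetic is exact, so the == 0 check is trustworthy here.
        if abs(r - candidate) < tol and evaluate(coeffs, candidate) == 0:
            snapped.append(candidate)
        else:
            snapped.append(r)
    return snapped


coeffs = [1, -5, 8, -4]
approx = [2.00000006, 1.99999994, 1.0]  # the values printed above
exact = snap_roots(coeffs, approx)
print(exact)  # [2, 2, 1]
```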

robust cliff Apr 7, 2023, 2:52 PM

#

I just made the second part where I plug the values back in

wooden sail Apr 7, 2023, 2:52 PM

#

finding roots of polynomials is a very nasty computational task. it's an ill conditioned problem

robust cliff Apr 7, 2023, 2:53 PM

#

what method does numpy use?

wooden sail Apr 7, 2023, 2:53 PM

#

not newton, i think: np.roots computes the eigenvalues of the polynomial's companion matrix

#

you'd have to make your own implementation based on the decimal library where you take numpy's solution and take a few newton steps yourself to get better precision

#

that still won't be exact
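
A minimal sketch of that polishing step, assuming 50 digits of working precision. The helper names are made up for illustration. It also shows why the result stays inexact for this polynomial: 2 is a double root, and there Newton's method only about halves the error per step.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # working precision in significant digits (an assumption)


def poly_and_derivative(coeffs, x):
    """Horner scheme returning p(x) and p'(x); coefficients highest degree first."""
    value = Decimal(0)
    slope = Decimal(0)
    for c in coeffs:
        slope = slope * x + value  # uses the value from before this coefficient
        value = value * x + c
    return value, slope


def newton_polish(coeffs, start, steps=5):
    """Refine an approximate root with a few Newton steps in Decimal arithmetic."""
    coeffs = [Decimal(c) for c in coeffs]
    x = Decimal(str(start))
    for _ in range(steps):
        value, slope = poly_and_derivative(coeffs, x)
        if slope == 0:
            break  # derivative vanished; no Newton step is possible
        x -= value / slope
    return x


coeffs = [1, -5, 8, -4]
simple = newton_polish(coeffs, 1.0000001)   # simple root at 1: converges fast
double = newton_polish(coeffs, 2.00000006)  # double root at 2: converges slowly
print(abs(simple - 1), abs(double - 2))
```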

robust cliff Apr 7, 2023, 2:54 PM

#

sounds like a lot of work

wooden sail Apr 7, 2023, 2:55 PM

#

not terribly, but also it's only worth it in some cases. what are the roots being used for?

robust cliff Apr 7, 2023, 2:55 PM

#

factorization

wooden sail Apr 7, 2023, 2:56 PM

#

you'll be better off using CAS for that

robust cliff Apr 7, 2023, 2:56 PM

#

why's that?

wooden sail Apr 7, 2023, 2:58 PM

#

because you won't be able to find the exact roots using numerics except for very special cases

#

especially considering floating point can only represent (some) rational numbers
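
To illustrate the exact approach: for integer coefficients, the rational root theorem gives a finite list of candidates that can be tested in exact arithmetic. The sketch below is only an illustration of that idea with made-up helper names, not what a CAS does internally. It assumes nonzero leading and constant coefficients and does not report multiplicities.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs, x):
    """Exact Horner evaluation; coefficients are highest degree first."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def rational_roots(coeffs):
    """All rational roots of an integer polynomial, found exactly."""
    # Any rational root p/q has p dividing the constant term and q the leading one.
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(coeffs[-1])
        for q in divisors(coeffs[0])
        for sign in (1, -1)
    }
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)


roots = rational_roots([1, -5, 8, -4])
print(roots)
```

For the polynomial above this finds 1 and 2 with no rounding at all, which is the kind of answer a CAS such as SymPy gives.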

subtle mural Apr 7, 2023, 3:01 PM

#

Does anyone here have experience with using DeepSpeed or colossal AI?

#

Am i able to use these to reduce the VRAM requirements of my models by offloading them to NVME or RAM?

#

Is there any specific limitations to model architectures e.g transformers only, or can i use it for all model architectures?

quaint loom Apr 7, 2023, 3:14 PM

#

How do I modify this code in order to remove the first column in the excel file that include the year?

subtle mural Apr 7, 2023, 3:24 PM

#

try removing the year from your dataframe?

quaint loom Apr 7, 2023, 3:26 PM

#

subtle mural try removing the yrea from your dataframe?

Explain please

subtle mural Apr 7, 2023, 3:28 PM

#

quaint loom Explain please

remove the values for years in your data, or just unselect them before plotting

#

im assuming you don't want to plot the year in the graph?

visual violet Apr 7, 2023, 3:31 PM

#

hi

#

i am trying to understand the shapes of this graph

Screenshot_2023-04-07_at_11.31.29_AM.jpg

#

at the very bottom, i don't know where the 20 and the 650 comes from

echo canyon Apr 7, 2023, 3:31 PM

#

hello guys could you help outa fellow

serene scaffold Apr 7, 2023, 3:32 PM

#

echo canyon hello guys could you help outa fellow

You have to ask a question

echo canyon Apr 7, 2023, 3:33 PM

#

why my validation accuracy is low (50 percent) while training accuracy is over 95 percent?

subtle mural Apr 7, 2023, 3:33 PM

#

visual violet at the very bottom, i don't know where the 20 and the 650 comes from

you need to look at wherever you got the image from

echo canyon Apr 7, 2023, 3:33 PM

#

I want to train a Neural Network using EfficientNetB2 and while the training phase is going well the validation is always on 50 percent. This is my code :

subtle mural Apr 7, 2023, 3:34 PM

#

It's just the shape of the input to the LSTM

echo canyon Apr 7, 2023, 3:34 PM

#

import numpy as np
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator

from google.colab import drive
drive.mount('/content/gdrive')

!unzip /content/gdrive/MyDrive/chest-xray-pneumonia.zip

img = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/train')
img_val = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/val')
img_test = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/test')

subtle mural Apr 7, 2023, 3:34 PM

#

just to confirm, by val accuracy, you mean validation and NOT test?

echo canyon Apr 7, 2023, 3:34 PM

#

Data augmentation

IMG_SIZE = 128
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE),
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.RandomTranslation(0.2, 0.2),
    tf.keras.layers.RandomRotation(0.3),
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomZoom(0.2)
])

data_resize = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE),
    tf.keras.layers.Rescaling(1./255)
])

img_aug = img.map(
    lambda x, y: (data_augmentation(x, training=True), y))

img_val_aug = img_val.map(
    lambda x, y: (data_resize(x, training=True), y))
img_test_aug = img_test.map(
    lambda x, y: (data_resize(x, training=True), y))
# Dataset.shuffle returns a new dataset, so the result has to be reassigned.
img_aug = img_aug.shuffle(100)

#

model structure

base_model = tf.keras.applications.EfficientNetB2(input_shape=(128, 128, 3),
                                                  include_top=False,
                                                  weights='imagenet')
base_model.trainable = True

my_model = tf.keras.models.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

my_model.summary()

#

Balancing Labels

normal_count = 1341
pneumonia_count = 3875
total_count = normal_count + pneumonia_count

normal_weight = (1 / normal_count) * (total_count) / 2.0
pneumonia_weight = (1 / pneumonia_count) * (total_count) / 2.0

class_weight = {0: normal_weight,
                1: pneumonia_weight}
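
The weighting above follows the usual "balanced" recipe, weight = total / (number of classes × class count). A minimal sketch of the same arithmetic for any number of classes follows. The helper name `balanced_class_weights` is made up for illustration.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Same counts as above: 0 = normal, 1 = pneumonia.
counts = {0: 1341, 1: 3875}
weights = balanced_class_weights(counts)
print(weights)

# Sanity check: every class ends up contributing the same total weight.
contribution = {label: weights[label] * n for label, n in counts.items()}
```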

#

Compile and train

base_learning_rate = 0.001

my_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
                 loss=tf.keras.losses.BinaryCrossentropy(),
                 metrics=['accuracy'])

history = my_model.fit(img_aug,
                       epochs=5,
                       batch_size = 128,
                       class_weight=class_weight)

#

I tried different approaches to importing the dataset but the result is always the same. According to the article "Automated Diagnosis of Pneumonia from Classification of Chest X-Ray Images using EfficientNet", I should get more than 90 percent accuracy on the validation and test sets.

#

https://www.researchgate.net/publication/351643298_Automated_Diagnosis_of_Pneumonia_from_Classification_of_Chest_X-Ray_Im_ages_using_EfficientNet

#

i didnt get any comments on stack

subtle mural Apr 7, 2023, 3:38 PM

#

im assuming your code isn't from a linked github from the paper?

visual violet Apr 7, 2023, 3:39 PM

#

subtle mural you need to look at wherever you got the image from

Can you explain what the 3 numbers mean

subtle mural Apr 7, 2023, 3:43 PM

#

@visual violet Looking at the image, im guessing the 650 is the size of the hidden layer, the 35 is the number of neurons, and the 20 is the batch size..?

#

i could be wrong though, been a long time since i last did anything with LSTMs

smoky epoch Apr 7, 2023, 4:02 PM

#

i have some data that ive cleaned (just removed and replaced null values) but i wanna ask how do u know which columns need normalizing or is it good practice to always normalise?

quaint loom Apr 7, 2023, 4:05 PM

#

subtle mural remove the values for years in your data, or just unselect them before plotting

How would I unselect them before plotting? I may want to use the years later, so it would be a pity to remove them now

visual violet Apr 7, 2023, 4:12 PM

#

subtle mural @visual violet Looking at the image, im guessing the 650 is the size of ...

That sounds good. Thank you man

#

Also can you explain what hidden size means ?

#

Like the number of outputs ?

quaint loom Apr 7, 2023, 4:29 PM

#

quaint loom How do I modify this code in order to remove the first column in the excel file ...

@wooden sail @subtle mural How do I unselect them before plotting this?

wooden sail Apr 7, 2023, 4:30 PM

#

maybe just if country == "Year": continue
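
A minimal sketch of that idea. The original plotting code is not shown here, so a plain dict with made-up numbers stands in for the spreadsheet. Looping over a pandas DataFrame yields its column names in the same way, so the `continue` line carries over unchanged.

```python
# Hypothetical stand-in for the spreadsheet: a "Year" column plus one per country.
columns = {
    "Year": [2019, 2020, 2021],
    "Norway": [1.2, 0.8, 1.5],
    "Sweden": [1.0, 0.6, 1.4],
}

series_to_plot = {}
for country, values in columns.items():
    if country == "Year":
        continue  # the years stay in the data, they are only skipped for plotting
    series_to_plot[country] = values

years = columns["Year"]  # still available, for example as the x-axis
print(list(series_to_plot))
```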

serene scaffold Apr 7, 2023, 4:31 PM

#

quaint loom @wooden sail @subtle mural How do I unselect them before plotti...

Please don't ping frequent answerers unless they've already started answering your specific question.

quaint loom Apr 7, 2023, 4:33 PM

#

serene scaffold Please don't ping frequent answerers unless they've already started answering yo...

Okey. I am familiar that Edd is a genius so, I know he may know. Sorry for that.

#

What is this webpage that I can write the code and share it here?

dusky cargo Apr 7, 2023, 4:38 PM

#

correspondingNegativeReviewsQuantity.append(len(reviews_with_sentiment.loc[(reviews_with_sentiment['sentiment'] == 'Negative') & (reviews_with_sentiment['id'] == i)]))
how do i make things more readable cos this is getting silly XD

quaint loom Apr 7, 2023, 4:39 PM

#

dusky cargo `correspondingNegativeReviewsQuantity.append(len(reviews_with_sentiment.loc[(rev...

You can break down the code into multiple lines using line continuation with the backslash \ symbol.
example:

correspondingNegativeReviewsQuantity.append(
    len(
        reviews_with_sentiment.loc[
            (reviews_with_sentiment['sentiment'] == 'Negative')
            & (reviews_with_sentiment['id'] == i)
        ]
    )
)

mild dirge Apr 7, 2023, 4:39 PM

#

Or split it up into multiple lines

#

cond1 = reviews_with_sentiment['sentiment'] == 'Negative'
cond2 = reviews_with_sentiment['id'] == i
correspondingNegativeReviewsQuantity.append(len(reviews_with_sentiment.loc[cond1 & cond2]))

quaint loom Apr 7, 2023, 4:43 PM

#

wooden sail maybe just if country == "Year": continue

https://paste.pythondiscord.com/derupaxizo

red rose Apr 7, 2023, 4:48 PM

#

hi! i'm trying to apply the function mapPartitions to a file with 111k rows in Google Colab but it got stuck there. Does anyone know how could i fix it? thanks in advance.

#

quaint loom Apr 7, 2023, 4:54 PM

#

red rose

Try https://paste.pythondiscord.com/isonozepiy
Optimize your code by using efficient algorithms and data structures. If your code still runs slowly, consider breaking up the tasks into smaller, more manageable chunks.

red rose Apr 7, 2023, 4:55 PM

#

quaint loom Try https://paste.pythondiscord.com/isonozepiy Optimize your code by using effic...

how many partitions should i set for a file with 111k rows?

quaint loom Apr 7, 2023, 4:55 PM

#

red rose hi! i'm trying to apply the function mapPartitions to a file with 111k rows in G...

If the data is not evenly distributed across the partitions, some partitions may take longer to process than others, leading to performance issues

quaint loom Apr 7, 2023, 4:58 PM

#

red rose how many partitions should i set for a file with 111k rows?

It depends on several factors such as the size of your cluster, the available resources, and the nature of your data and processing tasks. A good starting point is to set the number of partitions to be a multiple of the number of cores in your cluster. For example, if you have a 4-core cluster, you can set the number of partitions to 4, 8, 12, or any other multiple of 4.

red rose Apr 7, 2023, 5:02 PM

#

quaint loom It depends on several factors such as the size of your cluster, the available re...

okay tx! i think google colab has only 2 cores

#

i'll try w 2

plain jungle Apr 7, 2023, 5:10 PM

#

So I want to make a layer of abstraction using a NLP for my database. Is there any good resources? This is what I currently have

#

(Posted also in #databases )

serene scaffold Apr 7, 2023, 5:16 PM

#

plain jungle So I want to make a layer of abstraction using a NLP for my database. Is there a...

NLP is a concept. Programs that apply NLP are not called "an NLP".

You'll want to look into intent classification. You need a model that can take a user's command as text, and identify which in a specific set of interaction types the user wants to do. And which words are important for that specific interaction. So if you ask "what is John's phone number", the model needs to classify this as REQUEST_PHONE_NO and JOHN as the entity of interest.

plain jungle Apr 7, 2023, 5:17 PM

#

Right, so currently I am using the spacy library and their NLP command does that (kinda) I wasn’t sure if there was a better resource to investigate on top of that

serene scaffold Apr 7, 2023, 5:19 PM

#

What is "their NLP command"?

plain jungle Apr 7, 2023, 5:21 PM

#

import spacy

# Load English tokenizer, tagger, parser and NER
nlp = spacy.load("en_core_web_sm")

doc = nlp("Mr. Best flew to New York on Saturday morning.")
ents = list(doc.ents)
assert ents[0].label_ == "PERSON"
assert ents[0].text == "Mr. Best"

serene scaffold Apr 7, 2023, 5:23 PM

#

plain jungle ```py import spacy # Load English tokenizer, tagger, parser and NER nlp = spacy...

That's not "an NLP command". nlp is just the variable for the English model instance.

plain jungle Apr 7, 2023, 5:23 PM

#

ah understand, sorry

#

currently I have my code setup to use the

doc.noun_chunks

to best match what item the person is looking for in the database

serene scaffold Apr 7, 2023, 5:25 PM

#

Anyway, you can use spaCy's entity recognition capabilities to identify relevant parts of the user input. But you still need to figure out what kind of information the user is asking for.

plain jungle Apr 7, 2023, 5:26 PM

#

are there any libraries which help with that, or does that need to be coded on my own?

serene scaffold Apr 7, 2023, 5:28 PM

#

plain jungle are there any librarys which help with that, or does that need to be coded on my...

Learning the basics of spacy is probably ambitious enough for this project, so I would probably just use conditional logic

#

Because you wouldn't be able to train an intent classifier without learning the basics of model training. Which is an entirely separate concern from what library you use
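
A minimal sketch of what "conditional logic" could look like, using only regular expressions. The intent names, the patterns, and the `classify` helper are all made up for illustration. In a real project the entity would more likely come from spaCy's `doc.ents` than from a possessive regex.

```python
import re

# Hypothetical intents, each recognised by a keyword pattern.
INTENT_PATTERNS = {
    "REQUEST_PHONE_NO": re.compile(r"\bphone( number)?\b", re.IGNORECASE),
    "REQUEST_EMAIL": re.compile(r"\be-?mail\b", re.IGNORECASE),
}

# Very rough entity rule: a capitalised word followed by 's.
POSSESSIVE = re.compile(r"\b([A-Z][a-z]+)'s\b")


def classify(utterance):
    """Return (intent, entity) for a user request, or ("UNKNOWN", None)."""
    intent = "UNKNOWN"
    for name, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            intent = name
            break
    match = POSSESSIVE.search(utterance)
    entity = match.group(1).upper() if match else None
    return intent, entity


result = classify("what is John's phone number")
print(result)  # ('REQUEST_PHONE_NO', 'JOHN')
```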

plain jungle Apr 7, 2023, 5:30 PM

#

okay! Thank you for the help! I appreciate it

untold cliff Apr 7, 2023, 5:55 PM

#

quaint loom You can break down the code into multiple lines using line continuation with the...

You can put it inside parentheses instead of using \ cause anything in parentheses counts as a single expression i believe
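
A minimal sketch of the two styles side by side, with made-up values. Inside `( )`, `[ ]` or `{ }` Python joins the lines implicitly, so no backslash is needed.

```python
values = [3, 8, 15, 4]

# Backslash continuation: works, but breaks if anything follows the backslash.
count_backslash = len([v for v in values if v > 3]) + \
    len([v for v in values if v < 4])

# Implicit continuation: the open parenthesis keeps the expression going.
count_parens = (
    len([v for v in values if v > 3])
    + len([v for v in values if v < 4])
)

print(count_backslash, count_parens)
```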

waxen iron Apr 7, 2023, 6:07 PM

#

are there automated scripts for finding white space areas in pictures?

fiery jungle Apr 7, 2023, 6:08 PM

#

hey,
i am self learning , i wanna ask a question about feature column

feature_columns = []

latitude = tf.feature_column.numeric_column("latitude")
feature_columns.append(latitude)

longitude = tf.feature_column.numeric_column("longitude")
feature_columns.append(longitude)

fp_feature_layer = layers.DenseFeatures(feature_columns)  # ?????????????????????

def create_model(my_learning_rate, feature_layer):
  """Create and compile a simple linear regression model."""
  # Most simple tf.keras models are sequential.
  model = tf.keras.models.Sequential()

  model.add(feature_layer)  # ???????????????????????????????????????? (2)

  model.add(tf.keras.layers.Dense(units=1, input_shape=(1,)))  # ???????????????????????????????? (3)

  model.compile(optimizer=tf.keras.optimizers.experimental.RMSprop(learning_rate=my_learning_rate),
                loss="mean_squared_error",
                metrics=[tf.keras.metrics.RootMeanSquaredError()])

  return model

is there is a reason for adding fp_feature_layer alone to the model ?
I thought all dense layers are added on (3) why did we add feature_layer alone???

#

NVM , chat gpt got it LOL

#

mind blowing

iron basalt Apr 7, 2023, 7:23 PM

#

robust cliff why's that?

https://en.wikipedia.org/wiki/Polynomial_root-finding_algorithms#Square-free_factorization

Polynomial root-finding algorithms

Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that up until the 19th century, algebra meant essentially theory of polynomial equations.

#

https://en.wikipedia.org/wiki/Square-free_polynomial#Yun's_algorithm

Square-free polynomial

In mathematics, a square-free polynomial is a polynomial defined over a field (or more generally, an integral domain) that does not have as a divisor any square of a non-constant polynomial. A univariate polynomial is square free if and only if it has no multiple root in an algebraically closed field containing its coefficients. This motivates t...

robust cliff Apr 7, 2023, 7:27 PM

#

hey thanks for the reply, I switched to sympy in the end

iron basalt Apr 7, 2023, 7:27 PM

#

robust cliff hey thanks for the reply, I switched to sympy in the end

Sympy is a good choice.

#

It does these kinds of algorithms.

robust cliff Apr 7, 2023, 7:28 PM

#

now I'm kind of figuring out how to factorize properly while dealing with polys being strings

queen cradle Apr 7, 2023, 7:36 PM

#

untold cliff Ok i'll have a look. Thanks a lot!

I had a chance to look. It's not in Pratt and Gibbons, but actually there's an explanation on Wikipedia under https://en.wikipedia.org/wiki/Rank_correlation#General_correlation_coefficient. The point is that if you look at it the right way, correlation coefficients are the cosine of an angle. The denominator is actually the usual normalization by the length of a vector (in this case, the lengths are really the Frobenius norm of certain matrices).
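
A minimal numerical check of that picture, with made-up data and helper names. Each sample is turned into the matrix of signs of its pairwise differences, and the cosine of the angle between those matrices (Frobenius inner product over the product of Frobenius norms) is compared with the textbook tau-b formula. The square root in the tau-b denominator is exactly where the geometric mean comes from.

```python
from math import sqrt


def sign(v):
    return (v > 0) - (v < 0)


def sign_matrix(values):
    """Matrix a[i][j] = sign(values[j] - values[i])."""
    return [[sign(b - a) for b in values] for a in values]


def frobenius_inner(A, B):
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def cosine_correlation(x, y):
    """Cosine of the angle between the two sign matrices."""
    A, B = sign_matrix(x), sign_matrix(y)
    return frobenius_inner(A, B) / sqrt(frobenius_inner(A, A) * frobenius_inner(B, B))


def kendall_tau_b(x, y):
    """Textbook tau-b: (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2))."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            sx, sy = sign(x[j] - x[i]), sign(y[j] - y[i])
            ties_x += sx == 0
            ties_y += sy == 0
            if sx * sy > 0:
                concordant += 1
            elif sx * sy < 0:
                discordant += 1
    n0 = n * (n - 1) // 2
    return (concordant - discordant) / sqrt((n0 - ties_x) * (n0 - ties_y))


x = [1, 2, 2, 3, 4]  # includes ties on purpose
y = [1, 3, 2, 2, 5]
print(cosine_correlation(x, y), kendall_tau_b(x, y))
```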

untold cliff Apr 7, 2023, 7:39 PM

#

queen cradle I had a chance to look. It's not in Pratt and Gibbons, but actually there's an e...

Thanks for the help!

nocturne eagle Apr 7, 2023, 8:02 PM

#

queen cradle I had a chance to look. It's not in Pratt and Gibbons, but actually there's an e...

oh god, I remember a huge debate about using kendall vs spearman correlation on a project I was on

#

worse, people kept bringing it up again and again

fallow venture Apr 7, 2023, 8:19 PM

#

Good afternoon

untold cliff Apr 7, 2023, 8:20 PM

#

Maybe this if you dont care about order:

from itertools import count

x = [0, 0, 0, 2, 50, 50, 80, 99, 998, 998, 998]
mapping = {element: code for element, code in zip(set(x), count())}
unique = [mapping[i] for i in x]
print(unique)

fallow venture Apr 7, 2023, 8:21 PM

#

Can someone help me?
I can't read a huge csv. Not even filtering only 2 columns

untold cliff Apr 7, 2023, 8:25 PM

#

I dont know about pytorch. How about numpy ?

#

np.unique has a return_index argument, if set to true, it would return the first index of each unique element which would be fine i guess

untold cliff Apr 7, 2023, 8:27 PM

#

fallow venture Can someone help me? I can't read a huge csv. Not even filtering only 2 columns

What is the problem exactly ?

red rose Apr 7, 2023, 8:29 PM

#

anyone know how to sort by length and then between the elements with same length sort by x[1]?

next valley Apr 7, 2023, 8:30 PM

#

Slight rant so sorry and yes i want answers but why do people try to modify a deep learning model then ask the most rudimentary question possible i.e. "how to increase context window from 2048 to large number"

untold cliff Apr 7, 2023, 8:31 PM

#

untold cliff np.unique has a return_index argument, if set to true, it would return the first...

@charred egret sorry, return_inverse is more appropriate i believe
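
A minimal sketch of what that looks like, assuming the goal is to replace each element with an integer code (the array is the same toy list as in my earlier snippet):

```python
import numpy as np

x = np.array([0, 0, 0, 2, 50, 50, 80, 99, 998, 998, 998])

# values: the sorted distinct elements.
# codes:  for each position in x, the index of that element inside values.
values, codes = np.unique(x, return_inverse=True)

print(values)
print(codes)
```

Unlike the set-based mapping, the codes here follow the sorted order of the values, and `values[codes]` reconstructs the original array.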

#

Good luck 😀

untold cliff Apr 7, 2023, 8:34 PM

#

red rose anyone know how to sort by length and then between the elements with same length...

(len(x), x[1]) ?

red rose Apr 7, 2023, 8:35 PM

#

untold cliff (len(x), x[1]) ?

already tried that

untold cliff Apr 7, 2023, 8:39 PM

#

red rose already tried that

This is weird

stone marlin Apr 7, 2023, 8:39 PM

#

^ Why are there tuples of strings sometimes and sometimes ints?

red rose Apr 7, 2023, 8:39 PM

#

stone marlin ^ Why are there tuples of strings sometimes and sometimes ints?

because it's frequent items

#

9 is a frequent item but 7,18 too

untold cliff Apr 7, 2023, 8:41 PM

#

red rose anyone know how to sort by length and then between the elements with same length...

Is this what you've tried so far ?

red rose Apr 7, 2023, 8:42 PM

#

untold cliff Is this what you've tried so far ?

also this

#

and

#

untold cliff Apr 7, 2023, 8:42 PM

#

How come you didn't get an error?

#

When you do len(x[0]) you're sometimes doing len on ints, which should give a TypeError

red rose Apr 7, 2023, 8:43 PM

#

untold cliff When you do len(x[0]) you're sometimes doing len on ints, which should give a TypeE...

uh, so i dont get any error

stone marlin Apr 7, 2023, 8:43 PM

#

Yeah, I cannot reproduce this.

#

But I think first, you shouldn't be mixing tuples of str[int]'s and ints. That seems like a pattern that will not make things easy to work with in your data.

untold cliff Apr 7, 2023, 8:50 PM

#

red rose already tried that

Is there a chance this is a custom output and not just print(sequence)?

red rose Apr 7, 2023, 8:50 PM

#

so if i do sort only by len it works but then the second column is not sorted

#

uh no

#

it's not working well either

untold cliff Apr 7, 2023, 8:51 PM

#

If your elements are all strings then it makes sense

#

You're not getting an error for len(x[0]) because they are strings. Everything from 9 down to 0 sorts first because those are 1-character strings; then come 2-character strings like 11 and 10, and also tuples, which have length 2.
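
A quick way to see it, as a sketch with made-up keys (I'm guessing at the shape of your data, so treat `keys` as hypothetical):

```python
# Hypothetical first fields: single items stored as strings, itemsets as tuples.
keys = ["9", "0", "10", "11", ("7", "18")]

# len() counts characters for strings and elements for tuples,
# so "10" and ("7", "18") both come out as 2.
lengths = [len(k) for k in keys]
print(lengths)

# A real int has no length at all, which is the error I expected to see.
try:
    len(9)
except TypeError as exc:
    int_error = str(exc)
    print(int_error)
```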

red rose Apr 7, 2023, 8:53 PM

#

what can i do to fix it

untold cliff Apr 7, 2023, 8:55 PM

#

Can you give me an example of what your desired output would be first

red rose Apr 7, 2023, 8:55 PM

#

first singleton, then double, then triplets, and each group sorted by x[1] ASC

next valley Apr 7, 2023, 9:09 PM

#

What library is this in?

red rose Apr 7, 2023, 9:09 PM

#

pyspark

next valley Apr 7, 2023, 9:10 PM

#

Sorry, I don't know. If it was in pandas I would be able to help

#

PepeKEKWHands

untold cliff Apr 7, 2023, 9:12 PM

#

@red rose try lambda x: (str(x).count(','), x[1])
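
Here is the idea in plain Python, as a sketch only: I'm assuming each record looks like (item_or_itemset, count), and `sorted()` stands in for whatever sort call you use in PySpark. The records and both key functions are made up for illustration.

```python
# Hypothetical records: (item or tuple of items, count).
records = [
    (("7", "18"), 3),
    ("9", 5),
    (("1", "2", "3"), 2),
    ("10", 4),
    (("2", "5"), 1),
]

def comma_key(x):
    # The number of commas in the printed record grows with the itemset size:
    # 1 for a singleton, 2 for a pair, 3 for a triplet.
    return (str(x).count(","), x[1])

def explicit_key(x):
    # Same ordering without relying on the printed form.
    items = x[0]
    size = 1 if isinstance(items, str) else len(items)
    return (size, x[1])

by_commas = sorted(records, key=comma_key)
by_size = sorted(records, key=explicit_key)
print(by_commas)
```

The comma trick would break if an item itself contained a comma or if singletons were stored as one-element tuples, so the explicit version is probably the safer one to keep.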

untold cliff Apr 7, 2023, 9:15 PM

#

next valley Sorry, I don't know. If it was in pandas I would be able to help

How would you do it in pandas

red rose Apr 7, 2023, 9:19 PM

#

untold cliff @red rose try lambda x: (str(x).count(','), x[1])

yh now it's working!!! thank you so much ❤️

next valley Apr 7, 2023, 9:24 PM

#

untold cliff How would you do it in pandas

Assuming it's a pd series and each element is a list or a standard number: sort_by(key= lambda x: len(x) if x is list else 0)

#

Would be the first thing to come to mind

untold cliff Apr 7, 2023, 9:32 PM

#

next valley Assuming it's a pd series and each element is a list or a standard number: `sort_b...

Thanks! Btw i think it should be isinstance(x, list) (or even better isinstance(x, collections.abc.Iterable) to include tuples and other iterables)
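
For reference, a sketch of how I think that would look as runnable pandas, assuming a Series that mixes lists, tuples and plain numbers. The data and the `element_size` helper are made up. Two things I changed on purpose: `Series.sort_values` is the method that takes `key`, and that key receives the whole Series, so the per-element function has to go through `map`. I also used `Sized` rather than `Iterable`, since that is what `len()` actually needs, and excluded strings so they count as scalars.

```python
from collections.abc import Sized

import pandas as pd

# Hypothetical mixed data: lists, tuples and plain numbers.
s = pd.Series([[1, 2, 3], 7, (4, 5), 2, [9]])

def element_size(x):
    # Containers sort by their length; scalars (and strings) count as 0.
    if isinstance(x, Sized) and not isinstance(x, str):
        return len(x)
    return 0

# key gets the whole Series and must return a Series of sort keys.
# mergesort is stable, so elements of equal size keep their original order.
ordered = s.sort_values(key=lambda col: col.map(element_size), kind="mergesort")
print(ordered)
```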

next valley Apr 7, 2023, 9:41 PM

#

Works too

sleek harbor Apr 7, 2023, 10:27 PM

#

This pic demonstrates how the order (degree) of polynomial features affects the MSE of a function. So it gets better, sweet spot, then worse.
So my question is: is it usually like that (better -> sweet spot -> worse) when tuning hyperparameters? Because if yes, wouldn't it be cool if, instead of an exhaustive GridSearchCV or a random RandomizedSearchCV, you could do something like a RandomizedSearchCV but with logic to home in on the best result? For example: take 3 random points, fit a curve, find its bottom point, adjust the curve, find the bottom point again, etc., rather than just choosing parameters randomly. Or would that not work because not all hyperparameters follow the better -> sweet spot -> worse rule, and some just produce entirely random, unpredictable results, meaning some random combination might just work better, so the only way to be sure you got the best params is a grid search?

Screenshot_2023-04-08-02-14-52-510_com.alensw.PicFolder0101.jpg

untold cliff Apr 7, 2023, 10:49 PM

#

sleek harbor This pic demonstrates how the order (degree) of polynomial features affects the ...

Hyperparameter tuning is done on validation data, or on the training data via cross-validation, not on the test data. If you do it the way you describe, I think you would be overfitting to the test data.
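
As a rough sketch of what I mean, here is the degree being picked by 5-fold cross-validation using only NumPy. Everything in it is an assumption for illustration: the synthetic sine data, the fold count, the degree range and the `cv_mse` helper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data (hypothetical): a noisy sine curve.
x = rng.uniform(-1.0, 1.0, 60)
y = np.sin(3.0 * x) + rng.normal(0.0, 0.1, 60)

# Fix the 5 folds once so every degree is scored on the same splits.
folds = np.array_split(rng.permutation(x.size), 5)

def cv_mse(degree):
    """Mean held-out MSE of a polynomial fit of the given degree."""
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(x.size), held_out)
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[held_out])
        errors.append(np.mean((pred - y[held_out]) ** 2))
    return float(np.mean(errors))

# Score each candidate degree and keep the one with the lowest validation error.
scores = {degree: cv_mse(degree) for degree in range(1, 11)}
best_degree = min(scores, key=scores.get)
print(best_degree, scores[best_degree])
```

The test set stays out of this loop entirely; it is only used once at the end to report how the chosen model does.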

fallow venture Apr 7, 2023, 10:50 PM

#

untold cliff What is the problem exactly ?

Resolved

#

Thank you

untold cliff Apr 7, 2023, 10:53 PM

#

fallow venture Resolved

Good to know. Can you share the solution with us?