Then you can do the same thing I suggested earlier, but once for each year within the cycle. So for a cycle of length 5, estimate the frequency of appearance using data from 2001, 2006, 2011, etc.; then the frequency of appearance using 2002, 2007, 2012, etc.; then 2003, 2008, 2013, etc.; ending with an estimate of the frequency of appearance based on whether there was an appearance in 2005, 2010, 2015, 2020.
#data-science-and-ml
For a 5-long cycle, that gives you five estimates. The one you want for 2023 would be the estimate based on 2003, 2008, 2013, and 2018.
The method I suggested earlier, where you look at the fraction of appearances for all the years, is actually the same as this one, but for a cycle of length 1.
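A minimal sketch of the per-position estimate described above. The input format (a dict of year to True/False), the function names, and the sample history are made up for illustration.

```python
def phase_frequencies(appeared_by_year, cycle_length, start_year):
    """Fraction of appearances for each position within the cycle.

    appeared_by_year: dict mapping year -> True/False (hypothetical input format).
    Entry p of the result uses the years start_year + p,
    start_year + p + cycle_length, and so on.
    """
    hits = [0] * cycle_length
    totals = [0] * cycle_length
    for year, appeared in appeared_by_year.items():
        phase = (year - start_year) % cycle_length
        totals[phase] += 1
        hits[phase] += int(appeared)
    return [h / t if t else float("nan") for h, t in zip(hits, totals)]


def estimate_for_year(appeared_by_year, cycle_length, start_year, target_year):
    """Pick the estimate that belongs to target_year's position in the cycle."""
    freqs = phase_frequencies(appeared_by_year, cycle_length, start_year)
    return freqs[(target_year - start_year) % cycle_length]


# Made-up history for 2001..2022: an appearance only in 2003, 2008, 2013, 2018.
history = {year: year % 5 == 3 for year in range(2001, 2023)}

freqs = phase_frequencies(history, cycle_length=5, start_year=2001)
estimate_2023 = estimate_for_year(history, 5, 2001, 2023)

# A cycle of length 1 reduces to the plain fraction over all years.
overall = phase_frequencies(history, cycle_length=1, start_year=2001)[0]
```

With this toy history the 2023 estimate comes from 2003, 2008, 2013, and 2018 only, while the length-1 cycle pools all 22 years.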
This is not the only thing you can do. If the years within a cycle are completely independent of each other, then it's the maximum likelihood estimate, and you'll be hard-pressed to do any better. But in the example we discussed earlier—two years off, epsilon% chance of appearance, two years on—successive years seem to be correlated with each other.
You can exploit that as follows: Make fraction of appearances estimates for each year in the cycle. Say that the cycle has length C and these fractions are x1, ..., xC. These fractions are periodic data. Assuming successive values should be correlated, you should get better results if you smooth this data.
There are a variety of ways to do this. You could convolve with a kernel. You could fit a Fourier series and discard high-frequency terms. There is stuff about this in the time series literature.
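A rough sketch of the kernel option, assuming a small symmetric kernel such as [1, 2, 1] and made-up per-position fractions. Because the data is periodic, the kernel wraps around the ends instead of padding.

```python
def circular_smooth(values, kernel):
    """Smooth periodic data with a symmetric, odd-length kernel that wraps around."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    total = sum(kernel)
    half = len(kernel) // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for offset, weight in enumerate(kernel, start=-half):
            # The modulo makes position -1 refer to the last entry of the cycle.
            acc += weight * values[(i + offset) % n]
        smoothed.append(acc / total)
    return smoothed


# Made-up fractions x1..x5: two years off, a small chance, two years on.
raw = [0.0, 0.0, 0.1, 1.0, 1.0]
smooth = circular_smooth(raw, kernel=[1, 2, 1])
```

Dividing by the kernel sum keeps the average of the fractions unchanged, so the smoothing only moves probability between neighbouring positions in the cycle.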
Okay, I have to get going. I hope this helps!
Oh, I remembered one other thing. You might get slightly better estimates of the fraction of appearances if you add one to the numbers of appearances and non-appearances. This is called Laplace’s Rule of Succession and is a kind of Bayesian technique.
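In code, the add-one adjustment is a one-liner. The function name is made up for this sketch.

```python
def laplace_estimate(appearances, non_appearances):
    """Rule of succession: add one to each count before taking the fraction."""
    return (appearances + 1) / (appearances + non_appearances + 2)


# Four appearances out of four years: the raw fraction is 1.0,
# the adjusted estimate stays a little below certainty.
adjusted = laplace_estimate(4, 0)

# With no data at all, the estimate falls back to one half.
no_data = laplace_estimate(0, 0)
```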
@queen cradle can I ask a question?
Hi there, if anyone is familiar with PCA, could you check whether this plot looks correct in terms of dimensionality reduction? To me it looks like I can reduce the dimensions to 18.
import numpy as np
import matplotlib.pyplot as plt

explained_variance_ratio = np.array([0.15204826, 0.12169249, 0.07493663, 0.06553111, 0.06038744,
                                     0.05784934, 0.05351325, 0.04958794, 0.04762545, 0.04716847,
                                     0.04626846, 0.04332389, 0.04200419, 0.03658627, 0.03129437,
                                     0.03022434, 0.02267887, 0.01051982, 0.00414997, 0.00241604,
                                     0.0001934])
# Calculate the cumulative explained variance
cumulative_explained_variance = np.cumsum(explained_variance_ratio)
# Plot the elbow plot
plt.figure(figsize=(10, 6))
plt.plot(range(1, len(explained_variance_ratio) + 1), explained_variance_ratio, marker='o', label='Individual Explained Variance')
plt.plot(range(1, len(cumulative_explained_variance) + 1), cumulative_explained_variance, marker='s', linestyle='--', label='Cumulative Explained Variance')
plt.xlabel('Number of Principal Components')
plt.ylabel('Explained Variance Ratio')
plt.title('Elbow Plot')
plt.legend()
plt.grid()
plt.show()
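One way to turn the plot into a number is to pick a cumulative-variance threshold and count how many leading components reach it. This is a sketch using the ratios posted above; the threshold values and the function name are assumptions, not a rule.

```python
from itertools import accumulate

explained_variance_ratio = [
    0.15204826, 0.12169249, 0.07493663, 0.06553111, 0.06038744,
    0.05784934, 0.05351325, 0.04958794, 0.04762545, 0.04716847,
    0.04626846, 0.04332389, 0.04200419, 0.03658627, 0.03129437,
    0.03022434, 0.02267887, 0.01051982, 0.00414997, 0.00241604,
    0.0001934,
]


def components_needed(ratios, threshold):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= threshold:
            return k
    return len(ratios)


k95 = components_needed(explained_variance_ratio, 0.95)
k99 = components_needed(explained_variance_ratio, 0.99)
```

With these ratios, 18 components is what a 99% threshold gives and a 95% threshold gives 16, so whether 18 is "correct" depends on how much variance you are willing to drop.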
how are pytorch conv2d biases and weights seeded?
(Sorry, I was busy. Back now for a little bit.) Go ahead.
I found it, but thanks ❤️
No problem!
Heya maybe I've not googled enough but I'm struggling to find a way to create a grid of subplots with an outer and inner set of axis labels
this is what I mean basically
I'm wondering if it's possible using pyplot or seaborn?
oh also the outer labels would be for categorical variables which i guess is obvious
if i have a tensor of shape (x,y,z,16) and i want to reshape it to (x,y,z,4,4) is there a tensorflow command to do that? any solution i've seen means x,y,z have to be known values but it's subject to change in my case, I just want to square the last axis
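A sketch of the idea in NumPy: build the new shape from the existing one so the leading sizes never have to be written down. The function name is made up. In TensorFlow the analogous call should be `tf.reshape(t, tf.concat([tf.shape(t)[:-1], [4, 4]], axis=0))`, which is not exercised here.

```python
import numpy as np


def square_last_axis(array, side=4):
    """Reshape (..., side * side) to (..., side, side), keeping the leading axes as they are."""
    if array.shape[-1] != side * side:
        raise ValueError("last axis must have side * side entries")
    # shape[:-1] is whatever x, y, z happen to be at run time.
    return array.reshape(array.shape[:-1] + (side, side))


batch = np.arange(2 * 3 * 5 * 16).reshape(2, 3, 5, 16)  # made-up (x, y, z, 16) data
squared = square_last_axis(batch)
```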
Anyone using synthetic data?
https://www.youtube.com/watch?v=ep0PhwsFx0A
Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical components of real data containing no identifiable information, ensuring an individual’s privacy.
We have released a new ydata-synthetic version that includes a Streamlit app to ease your process of synthetic data generation...
have you come across the term inset before?
might want to check this out as a demo if not: https://matplotlib.org/stable/gallery/axes_grid1/inset_locator_demo.html
it's a bit manual but probably can do what you wanted
Hey all, it me again
does anyone have any good alternatives (open-sourced) for Tableau?
Does the N-BEATS time series model have its own documentation? I'm trying to find it
thank you
hello all, what is the best machine learning model to use and implement an Alzheimer prediction model? the model must be able to handle custom inputs and have a prediction percentage as its output. we're building an mvp at the moment if that context matters. im torn between svm and random forests
also, best IDEs for machine learning and training models? i'm not convinced by VSC
(ping for reply)
can someone tell me how I can access each of the inner elements in my tensorflow prediction [[5.0252132e-37 8.4258248e-16 2.4297525e-25 1.6483234e-02 1.2865312e-22
9.8351675e-01 4.9724836e-31 2.6893213e-29 6.2175032e-10 5.4309886e-12]]
these aren't separated by a comma, this is really confusing
you can index as usual
try prediction[0][index_you_want_to_see]
or just comma separated too
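Both forms side by side, assuming the prediction is a NumPy array (which is what Keras `model.predict` returns). NumPy leaves the commas out when it prints an array, but the elements are still separate.

```python
import numpy as np

# The values from the question, typed in as a 1 x 10 array.
prediction = np.array([[5.0252132e-37, 8.4258248e-16, 2.4297525e-25, 1.6483234e-02, 1.2865312e-22,
                        9.8351675e-01, 4.9724836e-31, 2.6893213e-29, 6.2175032e-10, 5.4309886e-12]])

chained = prediction[0][3]    # index the row first, then the element
together = prediction[0, 3]   # same element, comma-separated indices

# Usually what you want next: the most likely class and its probability.
best_class = int(np.argmax(prediction[0]))
best_probability = float(prediction[0, best_class])
```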
omg thank you edd
When in databricks, I wanted to open a csv file directly
Does anyone know how to get this to work?
s3_boto = boto3.client('s3')
obj1 = s3_boto.get_object(Bucket=config_bucket, Key=path)
data = obj1['Body'].read().decode('latin1')
After this, I can write the file but I cannot open any contents I pull in the correct format
thank you for the suggestion! i've decided to include the relevant information in the titles of each of my subplots for now but will keep this in mind for the future
hi, im supposed to see anaconda 3/4 in my jupyter folder but its not there. how can i fix this?
what do you mean by "jupyter folder"?
sorry, i meant in the jupyter notebook folders
Alright. I don't use or like anaconda, so I'll let someone else take it from here.
👌
do you want to use anaconda as your interpreter?
sorry i dont know what that means im pretty new, i want to install tensor-gpu
what server do you guys recommend so I can train my models there?
ok. you have installed anaconda already, yes?
google colab is good and free
yes
ok. that's really all you need. anaconda installs its stuff in some directory in your computer, named anaconda3, but you never have to interact with that folder yourself
can you recommend me a good documentation that I can follow?
how?
I see that i have tensorflow but not tensorflow-gpu
right, tensorflow gpu has to be installed separately. lemme see if i can find a good explanation on how to install it, because that also requires some nvidia drivers
huh turns out it's kind of annoying in windows haha
https://neptune.ai/blog/installing-tensorflow-2-gpu-guide here is one guide on how to do it
their own is good. it should be fairly intuitive though: you basically run a jupyter notebook on their hardware. you can read about it here https://colab.research.google.com/
hey! there's my kakuro solver
i thought that code is "well" written but i'm not happy with the output so i'm looking for help what i can improve or change
Hey @rotund cove!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
the output looks like this:
```
Fitness value of the best solution = 97
Number of generations passed is 5000
Best solution found is:
['*', [0, 17], [0, 6], '*', '*']
[[11, 0], 1, 1, [0, 24], '*']
[[17, 0], 1, 1, 1, [0, 3]]
['*', [11, 0], 9, 9, 9]
['*', '*', [11, 0], 1, 9]
Correct solution is:
[['*', [0, 17], [0, 6], '*', '*'], [[11, 0], 8, 3, [0, 24], '*'], [[17, 0], 9, 1, 7, [0, 3]], ['*', [11, 0], 2, 8, 1], ['*', '*', [11, 0], 9, 2]]
```
how are pytorch conv2d biases and weights seeded?
wdym by seeded
that's probably in the docs, isn't it?
torch/nn/modules/conv.py lines 40 to 47
```py
def reset_parameters(self):
    n = self.in_channels
    for k in self.kernel_size:
        n *= k
    stdv = 1. / math.sqrt(n)
    self.weight.data.uniform_(-stdv, stdv)
    if self.bias is not None:
        self.bias.data.uniform_(-stdv, stdv)
```
first google result 😛
cool bot feature
so its a uniform distribution with a standard deviation of 1/sqrt(n)
and n is input channels
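A small worked version of the quoted `reset_parameters` snippet, with no PyTorch involved. Reading that snippet literally, `n` is the input channels multiplied by every kernel dimension, and `stdv` is the half-width of the uniform range rather than its standard deviation. The function name here is made up.

```python
import math


def conv_init_bound(in_channels, kernel_size):
    """Half-width of the uniform range used by the quoted snippet."""
    n = in_channels
    for k in kernel_size:
        n *= k  # fan-in: input channels times each kernel dimension
    return 1.0 / math.sqrt(n)


# Example: 3 input channels and a 3 x 3 kernel give n = 27.
bound = conv_init_bound(3, (3, 3))

# A uniform distribution on [-a, a] has standard deviation a / sqrt(3).
uniform_std = bound / math.sqrt(3)
```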
pog
you'll have to test this in your code and see, i don't know. but since it appears to return an iterable, it seems to be all sizes
so 4
test and see to make sure
so test and see to make sure
ok ty Edd ❤️
so it can be only jupyter?
on colab, yes
you can write all your code in regular .py files and then just import them and call them from the notebook
okay thank you
is there anyone that can help me with my code? i am trying to make it so that it loads audio tracks into a KNN and then uses input from a microphone in order to test it
what's troubling you about it?
i dont know if its actually training correctly, and i tried to get it to stop with 'p' but it just prints the same thing over and over. doesnt help that im new and have been trying to do it based on what little ive managed to find out
a good way to check that is to print the loss at every epoch
i know im going to sound dumb but how do i do that also whats that mean exactly
oh sorry i misread that as cnn haha i thought you were using a neural network
so you're doing vector quantization of audio
the best way of testing this would be to play back the encoded audio. looking at the numbers won't tell you all that much. mostly because of psychoacoustics (how sound is perceived vs what it actually is)
you can plot the original audio vs the encoded audio, but in general the best test is to play it back and listen to it
sorry i dont know if i explained it properly but what i need to happen is for the mic to pick up sound and compare it to the training data and play a sound depending on the class prediction
Doing that without a neural network is a bit strange 
Oh, I get it now. Do you want the sound to be generated? Or will it be a sample from your training data?
If it should be generated, then I really don't know how to do that with KNN...
huh that's very different from what i thought you were doing. in that case i also find it weird to do it without a neural network. what are the vectors you're using for knn?
its just a test to see if this system would work on whether or not it can correctly plot the sound of the mic
and as im new to this i have to make it simple or id be completely lost
Uh... Dealing with sound, per se, can't be simple
Unless there's situations where you can disregard mel-spectrograms, fourier-transform, etc. 
Got a question about Ordinary Least Squares & Non-Negative Least Squares.
What if you have a bunch of parameters, and some of them can be negative, some can't.. in scikit learn u can do positive=True to make all coefficients non-negative, but how do you make it so that only some select coefficients will be forced non-negative (for the always positive parameters), and the rest can be whatever?
the reason you use a mel transform is because the ear perceives the spectrum in a warped scale, not linearly. the frequency axis is compressed logarithmically, and a specific amount of attenuation is attributed to each frequency. similarly, the ear is more sensitive to some frequencies than to others, so there is a reference mask that predicts how well you'll be able to hear distortions at different frequencies. these are all things that only matter specifically because of how humans hear sound, there's nothing special about it otherwise
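For a feel of the warping, here is one commonly used Hz-to-mel formula, 2595 * log10(1 + f / 700). Other variants exist, so treat the constants as one convention rather than the definition.

```python
import math


def hz_to_mel(frequency_hz):
    """One common mel-scale formula; roughly linear at low frequencies, logarithmic higher up."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# The same 1000 Hz step covers far fewer mels high up than low down.
low_step = hz_to_mel(2000.0) - hz_to_mel(1000.0)
high_step = hz_to_mel(8000.0) - hz_to_mel(7000.0)
```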
But then...shouldn't it be used in this case, too?
there are other weird effects like intensity and temporal masking. loud, short duration sounds make it impossible to hear other sounds after, but also before them
oh yeah, that was a typo lol
idk why i typed no. that's my default response to everything, i guess
lol
But then... I guess that, unless you're making an algorithm for biology purposes or detection of wave sounds that can't be heard by humans, mel-spectrograms should be used by default, right?
you could try constrained optimization instead
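One way to do that, as a sketch: `scipy.optimize.lsq_linear` takes a lower and upper bound per coefficient, so only the selected ones are forced non-negative. The data, the true coefficients, and the choice of which coefficients to constrain are all made up here, and there is no intercept term.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                     # made-up design matrix
true_coef = np.array([2.0, -1.5, 0.5])
y = X @ true_coef + 0.01 * rng.normal(size=200)   # made-up targets with a little noise

# Hypothetical constraint: coefficients 0 and 2 must be >= 0, coefficient 1 is free.
lower = np.array([0.0, -np.inf, 0.0])
upper = np.array([np.inf, np.inf, np.inf])

result = lsq_linear(X, y, bounds=(lower, upper))
coef = result.x
```

If an intercept is needed, one option is to append a column of ones to `X` and leave its bounds at minus and plus infinity.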
well, most sound is inaudible to us
i work a lot with ultrasound
very low frequency sound is also just perceived as vibration. this includes most of seismology
Ultrasound, eh?
Do you use spectrograms directly? Energy spectrograms?
every now and then. i usually use parametric methods instead
yeah
And, is that how the ultrasound images are generated, by chance?
that's not quite what i meant by parametric methods btw. i meant more generally, parameter estimation
I see... Guess I'll have to take a look at that.
Have a bit ambitious project involving ultrasound sensors, but in micro/nano scale
I am creating a dataset for speech processing, I have a json file that I want to modify by adding 199 to every value in the index
basically i need to change the 1 in the first position as well as in audio path to be 200 and 2 into 201 and so on
No idea what that is.. thanks for the reply, I'll look into it. U've gotta be some kind of genius. I see u here all the time, u've got an answer to all questions, and most importantly - u got a pixel cat pfp 💯
||Edd = Math oracle
||
Hi everyone. I need help here with a plotly scatterplot. The units for CO2 flux on the y axis are wrong. It says µ, from 0-~160µ. But in my table it is in g / m^2 / s (around the 0.000x mark). I'm not sure what is happening here. Any suggestions?
Also, I want the y axis to read as CO_2 flux (g m^-2 s^-1). How do I write this?
The u is micro. There isn't any units there rn
Why aren't the data displaying the correct numbers on the y axis then?
0.0005 = 500µ . Does this make sense for ur data values?
Hmmm. Yep. I think it does thanks. Do you know how to fix this?
not sure. proly gotta read the docs
same for adding in the units u want it to display
alternatively u can put the units in the title
CO2 (g / m^2 /s )
Awesome. Thanks. How do I write that line in python?
thanks for your help homie
```
<ipython-input-39-19626fc93a6d> in <cell line: 2>()
1 from mlxtend.plotting import plot_decision_regions
----> 2 plot_decision_regions(x_train,y_train.values, clf=clf , legend = 2)
1 frames
/usr/local/lib/python3.9/dist-packages/matplotlib/axes/_base.py in axis(self, arg, emit, **kwargs)
2125 self.set_ylim(ymin, ymax, emit=emit, auto=yauto)
2126 if kwargs:
-> 2127 raise _api.kwarg_error("axis", kwargs)
2128 return (*self.get_xlim(), *self.get_ylim())
2129
TypeError: axis() got an unexpected keyword argument 'y_min'
```
this is my error
```
plot_decision_regions(x_train,y_train.values, clf=clf , legend = 2)
```
this is my code
I don't know what the error is here
just guessing here but maybe a package version mismatch?
how to resolve that I'm new here
its just a google away bruh
literally first search result
@cold osprey it is my first project so I don't know anything ☺️
[NLP]
Hi guys, thought it wouldn't be appropriate to spam my entire question here. But if anyone has any insights into applying BERT models on large bodies of text I would really appreciate it
https://discord.com/channels/267624335836053506/1092740527495061585
Perhaps I could also try finetuning it in relation to Central Bank communication? or apply a weighting to the sentiment with the specific target words in the text section
Really open to suggestions!
Not sure if this is the right chat for this, but I'm a total newbie to python and coding. For my medical physics internship project I have been given a task surrounding the processing of large 2D array data files in .xcc/XML format and making a bunch of graphs to help display results and tolerances. I have some code from a colleague to help me along but I'm really out of my depth. If someone in here is a wiz in this area I would super appreciate some help as I don't have an awful lot of time.
I don't have an awful lot of time.
if you essentially need someone to do it for you, you probably won't get help here. that said, you have to give all the information needed for someone to start helping, or no one will try to help.
Ahh I see. I don't need someone to do it for me, I think what I need is someone to review what tasks I have to perform, and tell me where I can find accurate resources for what I need to achieve, or even better, teach me how to complete the tasks. I'm happy to give more information, though it is quite a lot to type out, hence why I thought maybe it would be easier for someone to screen share with me and review the task and code I have for themselves.
I'm currently working with a pretty huge tensor multiplication, that I had working in numpy for a smaller version. There's complex numbers and item assignment is a pretty handy way to do some operations (like array[..., 0, 0] = 123), so started to look in to GPU acceleration.
I 'm using an M1 Mac, which pytorch supports, but apparently doesn't like working with complex numbers yet (which is a core part of what i'm doing) "TypeError: Trying to convert ComplexDouble to the MPS backend but it does not have support for that dtype."
I then looked at tensorflow, but it doesn't support item assignment (I guess I could make it work, but it would be a fair bit more complicated).
I also don't think JAX supports mac m1 gpu yet but i'm not sure.
Pytorch would be the most ideal if anyone knows a way of getting the gpu to play nicely with complex numbers.
Any ideas?
in fairness, pytorch also doesn't exactly support assignment. if your operations depend on the values of a tensor and you modify any of its entries and then use the tensor again, you'll get an error at some point down the line because the state of the tensor changed unexpectedly. to get to your actual question, i was under the impression pytorch did support differentiation with complex parameters using wirtinger calculus https://pytorch.org/docs/stable/notes/autograd.html#complex-autograd-doc but if not, you can always split your function into real and imaginary parts (has to be done carefully)
Is there a better library to run ANOVAs other than scipy?
how does a model training work exactly? I mean what's not clear is that how can I improve and optimize the model by running a python file, if that is gonna execute after run-time? So basically if it is not chaining anything "CONSTANT", then what is the point of doing that?
I just found out that the correlation coefficient measures relationship only and doesn't imply causation at all. So does this mean that including variables with good correlation with the target doesn't mean that they would actually be good for the model?
what kind of model?
just a basic model.
there's no "basic model" about which one can make general statements that apply to all model training.
I'm more curious about the "how", generally.
have you seen 3blue1brown's series about neural networks?
(there are lots of models that aren't neural networks, so statements about "basic neural networks" do not apply to all models.)
but how does the training work, exactly.
try watching that video series.
My page looks completely different
That code does not exist on the main branch and the branch that claims to be on also does not exist
I can’t find it anywhere
Hi peeps,
Not sure if this is the right area to ask but is anyone able to give me their opinions on which of the following setups are better?I'm planning on getting a new desktop.
I'm mainly looking to train and run both Vision and LLM(E.g llama 7b or 13b 8int) models locally
CPU
Intel Core i9 13900KF | 24 Cores 32 Threads
COOLING
AFTERSHOCK Glacier Mirror 360mm Watercooling
MOTHERBOARD
Gigabyte Z790 Aorus Elite AX D5
GPU
Gigabyte RTX 4090 Gaming OC 24GB
RAM
32GB ADATA Lancer RGB DDR5 6000MHz (16x2)
SSD
2TB Lexar NM710 Gen4 SSD
PSU
1000W FSP Hydro GT Pro 80+ Gold
VS
CPU
AMD Ryzen 9 7950X Processor
COOLING
AFTERSHOCK Glacier Mirror 360mm Watercooling
MOTHERBOARD
Gigabyte B650 Gaming X AX
GPU
Gigabyte RTX 4090 Gaming OC 24GB
RAM
32GB ADATA Lancer RGB DDR5 6000MHz (16x2)
SSD
2TB Lexar NM710 GEN4 SSD
PSU
1200W FSP Hydro Ptm Pro 1200W 80+ Platinum
the main concern for model training is going to be GPU size. It looks like both involve the 4090 at 24GB, so you'll want to confirm that the LLMs you want to work with take up less space than that.
AMD Ryzen 9 7950X Processor what's the core and thread count for this one?
Heya, thanks for taking the time to reply 🙂
The processor is 16 core, 32 threads
looks like the first build is generally better, though since you have the same GPU and RAM in both builds, I'm not sure how much it will impact your training.
Thanks, your opinions helps!
I'm guessing that the intel build is better because of the higher core count?
im thinking that the only way my training will be affected will be in CPU bound tasks e.g data pipeline processing, augmentation etc🤔
right. though if you're training on the GPU, then you're not using all those extra cores.
im thinking that the only way my training will be affected will be in CPU bound tasks e.g data pipeline processing, augmentation etc
how much of that can you actually do in parallel?
and even if you could, how much time would it actually save you?
Most of the time 2 workers is enough for loading in the data and augmenting
those are some good points, i gotta think about this more😅
i didn't know that o.0 i usually just put as many workers as i have cores 🤣
Edd
why speaketh thou the blessed name?
Edd my savior
He smartest person on the internet
And Edd knows PyTorch
you might as well just ask your question as if Edd were here. Maybe he'll answer it later. Maybe someone else can answer it.
Do y'all ever use residual plots such as sns.residplot()? Or are they more of a "cool, that's possible, never gonna use it tho" kind of thing? Can I get some use case examples?
Why does PyTorch claim kernels are 3x3 but Edd says it does a convolution with a 3D kernel
When I look at how weights are initialized it uses self.kernel_size
When I did conv2D(999, 998, 997), self.kernel_size is 2 because the kernel is 2d
Also the GitHub page Edd referenced is non existent
I have this idea for a project. TLDR of it is to train how a player plays a game by looking at replay files and the corresponding level and generate a replay file from a new level. But there's so many issues:
- How do I link a replay and level?
- Do I have to split up the replay into it's keyframes or the whole replay at once?
- How does the encoder encode an entire file?
And so many more. I have no idea where to begin to start. If someone can help me find some resources or what to do to get started, I'd appreciate it. I've been looking and I haven't found anything that's remotely to do with how to do this :(
im guessing by 3 x 3 kernels, you're referring to pytorch's nn.Conv2D?
Yes
I assume you would train it the same way you train cars to drive from video frames
what's the context behind the convolution with a 3D kernel?
Or depending on the game you can try reinforcement learning
Using openCV to read the screen
The whole point of the project is to play with a player's playstyle, and replays are really the only way to get that information
What game
trackmania
seems at some point they replaced the xavier init with kaiming init https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py it's also a uniform distribution, just with a different range
the replays have the player inputs and positions of the car because they have some weird validation thing that doesnt work
Conv2d doesn't have a default 3 x 3 kernel? Do you have a link to pytorch saying that kernels are 3 x 3? From the convo you linked, im guessing that edd was referring to the 2D kernel of size (2 x 2) with the group setting being set to the default of 1 as per https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html ?
By 3x3 I meant nxm meanings 2 dimensional not 3
is there any advice for combining datasets for a multilabel classification cnn
hi ,
def plot_the_model(trained_weight, trained_bias, feature, label):
    """Plot the trained model against the training feature and label."""
    # Label the axes.
    plt.xlabel("feature")
    plt.ylabel("label")
    # Plot the feature values vs. label values.
    plt.scatter(feature, label)
    # Create a red line representing the model. The red line starts
    # at coordinates (x0, y0) and ends at coordinates (x1, y1).
    x0 = 0
    y0 = trained_bias
    x1 = feature[-1]  # <-- WHAT IS THIS ???
    y1 = trained_bias + (trained_weight * x1)
    plt.plot([x0, x1], [y0, y1], c='r')
    # Render the scatter plot and the red line.
    plt.show()
could someone tell me what is feature[-1] ?
im trying to learn TF and im fairly new to machine learning
how could feature order be in negative ?
at a glance, it looks like you're fitting a straight line
correct
this would draw a line segment joining the first and last points
broadly speaking, list[-1] / array[-1] in python = list[len(list) - 1] / array[len(array) - 1]
ohh now i see
thanks a lot
feature is your vector of input values, the x values. feature[-1] is "the last element of the feature vector", hence why it "joins the first and last points" on the line
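A tiny demo of the same thing, with made-up x values:

```python
feature = [1.0, 2.0, 3.0, 12.0]  # made-up x values

last = feature[-1]                      # negative indices count from the end
same_last = feature[len(feature) - 1]   # the explicit equivalent
second_to_last = feature[-2]
```

So the index is not a negative "order"; it is only shorthand for counting backwards from the end.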
that make sense , thanks a lot ❤️
guys, i'm building an application on streamlit. I have a dataframe, and I made an aggrid table based on it. Then I made a groupby on the filtered data from this aggrid table. Now, I want to make a new aggrid table, based on this groupby, and this groupby updates every time I filter the original aggrid table. The groupby update I can do, it's done, but the aggrid table of this groupby is not working
as I update the groupby data, this new aggrid table should update as well
Has anybody used YOLOv8 to measure objects?
sklearns PolynomialFeatures vs numpys polyfit.. is there any point in using numpys version when sklearn does the same thing? Like, if I'm more comfortable with sklearn, can I just stick with it, or are there drawbacks?
use sklearn's
numpy's method might be preferred if you want to do complicated statistics and/or want full control over what your code and model are doing, but if you are more focused on the end result / the predictions than the details of the method itself, just use sklearn
this CNN tutorial I'm following made a model and initialized it with model(4). He passed an int as the parameter - is that normal?
depends - that "model" is which function or class exactly?
```py
img = cv2.imread('1902539.jpg')
plt.imshow(img)
plt.show()
```
whats the problem with my code?
w/o the py
what happened to tell you that it's wrong?
unless it's an obvious syntax error, it's usually not possible to just look at code and know immediately what's wrong with it.
or if it is, it's unnecessarily difficult, as compared to figuring out what's wrong with it when you have some information about its intention.
hi,
I have 2 questions on this
model.add(tf.keras.layers.Dense(units=1,
input_shape=(1,)))
what do units=1 and input_shape=(1,) mean? and why do we use input_shape=(1,) and not input_shape=(1)?
im completely new to ML so forgive me if it was a noob question 😄
nvm chat gpt explained it for me LOL
what would the humans do after the ai takes over ? 😄
chatGpt: We use input_shape=(1,) instead of input_shape=(1) because the input shape must always be a tuple, even if it only contains one value.
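The Python side of that answer is easy to check: parentheses alone do not make a tuple, the trailing comma does.

```python
not_a_tuple = (1)   # just the integer 1 wrapped in parentheses
one_tuple = (1,)    # a tuple with a single element

kinds = (type(not_a_tuple).__name__, type(one_tuple).__name__)
```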
I have normalised my data by subtracting by the mean and dividing by the standard deviation but a lot of my features still have a pretty big range. They don't all fall between -1 and 1. They also don't all have a mean centred at 0 will this be a problem for my neural network
Regarding your concern about the range of some features not falling between -1 and 1, and the mean not being centered at 0, it's important to remember that normalization is just one step in preprocessing your data. While it can help, it's not always necessary for all neural networks.
If you find that your neural network is not performing as well as you'd like, you could try different techniques for normalization, such as min-max scaling or using feature scaling methods like logarithmic scaling. You could also try other preprocessing techniques like feature engineering or dimensionality reduction to see if they improve the performance of your model.
Overall, it's important to test and experiment with different preprocessing techniques to find the best approach for your specific use case
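A quick check of what mean/std normalisation does and does not guarantee, on a made-up feature with one large value. After the transform the mean should be 0 and the standard deviation 1 (up to rounding), but nothing forces the values into [-1, 1]. If the mean is not close to 0 afterwards, it may be worth checking that the statistics were computed per feature and on the same data they were applied to.

```python
import statistics


def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


raw = [1.0, 2.0, 2.0, 3.0, 50.0]  # made-up feature with one large value
z = standardize(raw)

z_mean = statistics.fmean(z)
z_std = statistics.pstdev(z)
z_max = max(z)
```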
a bit of recursion humor written and voiced by AI
https://youtu.be/pIhR1CFGYVQ
*auto subtitles not accurate. Void where prohibited.
Join the Slack discussion at https://join.slack.com/t/northsparkdef-kpr3977/shared_invite/zt-1rc4hgg8y-SschYAlEzVASY1pNmj~7aQ
Our list of links - https://msha.ke/north
https://www.linkedin.com/company/north-spark-defense-laboratory/
can someone help to make code visualization of data
I need help
OpenCV error
```
---------------------------------------------------------------------------
error Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_11504\1904821197.py in <module>
88 #print(outputs[2].shape)
89
---> 90 count=findObjects(outputs,img)
91 inccount+=count
92 countReset=int(time.time()-startTime)
~\AppData\Local\Temp\ipykernel_11504\1904821197.py in findObjects(outputs, img)
65 #print(classNames)
66 #print(int(confs[i]*100))
---> 67 cv2.line(0,(int(img.shape[0]/2)+3,int(img.shape[1]),int(img.shape[0]/2)-3),(0,0,100),1)
68 cv2.putText(img,f'{classNames[classIds[i]].upper()} {int(confs[i]*100)}%',(x,y-10),cv2.FONT_HERSHEY_SIMPLEX,0.6,(255,255,0),2)
69 return count
error: OpenCV(4.7.0) :-1: error: (-5:Bad argument) in function 'line'
> Overload resolution failed:
> - Can't parse 'pt1'. Expected sequence length 2, got 3
> - Can't parse 'pt1'. Expected sequence length 2, got 3
```
Here's the code: https://paste.pythondiscord.com/lutebijuru
Hey @arctic moss!
It looks like you tried to attach file type(s) that we do not allow (.heic). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.
Feel free to ask in #community-meta if you think this is a mistake.
does this mean my model is overfitting? My validation is orange and training is grey
what algorithm I should use if I want to extract key points from some data kinda like in these screenshots.?
by key points you mean critical points? where the slope is 0?
yeah
i'm not sure if there's a built-in function for that. one thing you can do is use a finite difference scheme and mark points where the gradient changes sign
then interpolate between those points
https://numpy.org/doc/stable/reference/generated/numpy.gradient.html this can do the finite difference part for you. alternatively you could use cubic splines from the get-go and differentiate those
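A sketch of the finite-difference idea with `numpy.gradient`, on a made-up sine curve. The function name is an assumption. For peaks only, `scipy.signal.find_peaks` is another option.

```python
import numpy as np


def critical_point_indices(y, x=None):
    """Indices i where the finite-difference slope changes sign between i and i + 1."""
    slope = np.gradient(y) if x is None else np.gradient(y, x)
    sign = np.sign(slope)
    # A negative product means the slope flipped sign, so a peak or trough lies in between.
    # Samples where the slope is exactly zero (flat stretches) are not handled by this sketch.
    return np.where(sign[:-1] * sign[1:] < 0)[0]


x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x)  # made-up curve with one peak and one trough
idx = critical_point_indices(y, x)
```

On noisy mocap curves the slope can flip sign at every sample, so some smoothing before this step is probably needed.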
ah wait i remembered the name
yeah, im a 3d animator working on mocap data. in the software im importing the data to (shown in the above screenshots), the interpolation is done automatically so i don't need to worry about that. more just picking out the points where there's the most difference
thanks a bunch
hmm maybe i misunderstood the question, given what you just said
you want to compare the two curves to each other?
nope Just wanted to find the peaks and trough points.
ok. then yeah, peaks 😛
Is this the best channel to ask a question about pyomo?
if not, I put my question here: #1093093447302402069 message
Sigh
After diving deep into studying GANs...time to dive deep into Transformer... Looks like its simplicity ends when the implementation ends and training begins.
But I shall do this while my prototype of Text GAN is training 
I was trying to use Latex within matplotlib but I am getting this error
`! LaTeX Error: File 'type1ec.sty' not found.
Type X to quit or <RETURN> to proceed,
or enter new name. (Default extension: sty)
Enter file name:
! Emergency stop.
<read *>
l.8 \usepackage
[utf8]{inputenc}`
i searched and all the results are for linux. i am on windows 10
what were you trying to do?
Guys, does somebody know whether I can get predictions if I use Recbole and Bert4rec?
Is it wrong to impute missing values using simple strategies, like replacing with the mean, even if that gives better results than imputing with KNN? Even though it preserves the mean of the non-missing values, it biases the variance and covariance towards 0, so we might get optimistic results?
Suppose I give you data with two populations: There are 1000 samples with values clustered very tightly around (0, 0) and 99 samples with values clustered very tightly around (1, 1). There is one sample of the form (1.01, ?), where the ? is missing data. What do you think the ? should be?
Another situation. Suppose I give you a time series. It has a slow, periodic up-and-down cycle. On one of the up cycles, it goes up, and up, and keeps going up, and suddenly there's missing data. After a few missing data points, we observe some very high data values, and the values go down, and the same cycle as before is restored. What values do you think should be imputed?
In my opinion not the mean but the most likely value which depends on its closest neighbors i guess
Statistically, you're using nearest neighbors as a density approximation: Basically, you're saying that there's a density function (which has some shape you understand from the samples where you have data), and you're looking at the conditional density with respect to the data that you have (for the sample where you're trying to impute missing data). There are lots of ways of constructing density estimates, but nearest neighbors is a good one.
Replacing by the mean ignores all the information you actually have, so it's usually a poor choice.
And in the time series example I gave, you have information even though there are times where you have no data. This is because successive values of the time series are correlated. (The situation I actually had in mind is that the data comes from a sensor that reports "error" when the reading is too high.)
Actually now that i think about it it might be good to replace with the mean in case of a variable with low variability maybe ?
So its ok to replace with the mean in the 1st example ?
The first example is a bunch of data clustered around (0, 0) and a bunch of data clustered around (1, 1). The mean (and also the median) in the second coordinate will be around 0 because the subpopulation around (0, 0) has more members. If you have something whose first coordinate is near 1, and you guess that the second coordinate is near 0, then you're predicting the existence of something near (1, 0), where you have literally zero data and so zero reason to believe that your imputed data should be. You should impute a second coordinate near 1.
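A small sketch of that two-cluster example, using synthetic data (the cluster spreads, the seed, and `k = 5` are all made up for illustration). It compares mean imputation with a crude nearest-neighbor imputation on the known first coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)
big = rng.normal(0.0, 0.01, size=(1000, 2))   # 1000 samples near (0, 0)
small = rng.normal(1.0, 0.01, size=(99, 2))   # 99 samples near (1, 1)
data = np.vstack([big, small])

x_known = 1.01  # the sample (1.01, ?)

# Mean imputation ignores x_known entirely.
mean_impute = data[:, 1].mean()

# Nearest neighbors: average the second coordinate of the k samples
# whose first coordinate is closest to x_known.
k = 5
nearest = np.argsort(np.abs(data[:, 0] - x_known))[:k]
knn_impute = data[nearest, 1].mean()
```

Here `mean_impute` lands close to 0 because the (0, 0) cluster dominates, while `knn_impute` lands close to 1.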
Looking for professional ai developers I need help with a simple program made using ai if that sounds like you please dm
Ask your question here @fringe ermine
I see, that makes sense. Are there any cases where imputing with the mean (or median, most frequent ...) is more appropriate? (I believe these are called ad hoc methods, right?)
hi ,
I have a question about this pic
Im not familiar with this type of left hand side , i dont know how its processed and how it's ok to do (x1,y1) , (x2,y2) = someFunction() , how is this ok ??? how is this not raising any errors ?
chatgpt answered 😄
The syntax x, y = someFunction() is called "multiple assignment" in Python. It's a shorthand way of assigning multiple variables at once, based on the values returned by a function or another iterable object.
When you use multiple assignment, Python automatically unpacks the values returned by the function or iterable object and assigns them to the variables on the left-hand side. For example, if someFunction() returns a tuple of two values (1, 2), then x, y = someFunction() will assign the value 1 to x and the value 2 to y.
As long as the number of variables on the left-hand side matches the number of values returned by the function or iterable object, the assignment will work without raising any errors. The types of the values don't matter, since Python variables are not typed.
Overall, multiple assignment is a convenient feature of Python that can make code shorter and easier to read. But it's important to be careful when using it, especially with complex functions or iterable objects, to avoid unexpected behavior or errors.
This is how to unpack a tuple of tuples into 4 separate variables
!e
var = ((1, 2), (3, 4))
(a, b), (c, d) = var
print(a, b, c, d)
@mild dirge :white_check_mark: Your 3.11 eval job has completed with return code 0.
1 2 3 4
Doing it without brackets gives this
!e
var = ((1, 2), (3, 4))
a, b, c, d = var
print(a, b, c, d)
@mild dirge :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "/home/main.py", line 2, in <module>
003 | a, b, c, d = var
004 | ^^^^^^^^^^
005 | ValueError: not enough values to unpack (expected 4, got 2)
Ah, thought you were confused about the gpt answer, nvm
thanks for explaining it further but omg i'm fascinated by how chatgpt unpacked my trashy question, and what is also amazing is how it explained it in a very detailed answer 😄
Yeah, sometimes..
sometimes ? u had bad exp with it ?
It's just not reliable. It will tell you some incorrect stuff with full confidence. It's great for inspiration, or getting information that you can then verify yourself. But not just to ask for facts 😛
But for these kinda questions it's pretty nice indeed
oh yeah i agree but the way i see it, that it has reached the moon with a lighting speed, yes it hasnt reached every inch of the universe yet but looks like it wont be a problem in a near future if it's kept growing that way 😄
elon musk is right
I'm trying to use scipy's curve_fit. But I'm having trouble getting it to accept my mapping function because I think it should be a fractional logarithm, and
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from math import log
def bestGuess(x, a, b, c):
    comp = a * log(x, 1/b) + c
    return comp
plt.scatter(Msol, M_v)
curve_fit(bestGuess, Msol, M_v)
plt.legend()
plt.show()
I don't think numpy.log lets me define the base, so I resorted to math.log, but I dunno if that's the issue.
right now the error I'm getting is TypeError: only size-1 arrays can be converted to Python scalars for the line in my bestGuess function
what is the best way to use huge data-sets, I want to work with the "openwebtext" data-set but I don't want to download 12GB directly.
I think it would definitely be simplest to just download the dataset to your local machine
12 GB is not a lot, most machines have 16+ GB ram and a few terabytes of storage
I guess x is an array, and i dont think math.log takes arrays as arguments but scalars only so this might be the issue
Yeah, I think you're right. I'm having trouble understanding though how to give it what its looking for. I changed the expression, and think I'm getting closer. But (as far as I can tell) it looks like I'm giving it lists when it wants lists.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
Msol = [59, 48, ... 0.075]
M_v = [-5.8, -5.5, ... 20.5, 20.9]
def bestGuess(x, a, b, c):
    comp = a * np.exp(-b * x) + c
    return comp
popt, pcov = curve_fit(bestGuess, Msol, M_v)
plt.scatter(Msol, M_v)
plt.plot(Msol, bestGuess(Msol, *popt))
plt.xlabel('Msol')
plt.ylabel('Absolute Magnitude')
plt.show()
Now I'm getting: TypeError: can't multiply sequence by non-int of type 'numpy.float64' on the comp= line.
I understand that I'm passing Msol to it, which is a list, and not a float.
I successfully recreated PyTorch conv layer and the output is identical, that is one less point of failure in my custom nn
But I don't know how else to write the line to call it. It seems like I'm doing it correctly, in comparison to the tutorials I'm reading
Yeah you have to convert it to a numpy array in order to be able to perform multiplication like that. When you multiply a normal python list by an integer you're essentially repeating the list, like [1, 2] * 2 gives [1,2,1,2]
Though its a float in your case not an int and float isnt a valid operand type for this operation
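A quick illustration of the difference, with made-up values:

```python
import numpy as np

values = [1, 2]

repeated = values * 2               # list repetition: [1, 2, 1, 2]
scaled = np.asarray(values) * 2.5   # elementwise multiply: array([2.5, 5.0])

try:
    values * 2.5                    # a float is not a valid repeat count
except TypeError as err:
    message = str(err)              # "can't multiply sequence by non-int ..."
```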
(scikit learn)
When preprocessing, you're supposed to use fit_transform on training data and just transform on testing data, right? To prevent data leakage. But does that rule apply to PolynomialFeatures? I mean.. I don't really see what could possibly be leaked here.. So can I apply PolynomialFeatures before splitting a dataset into training&test data, or would that lead to some problems?
Ugh. I was afraid of that. It seems impossible to find a way to use a logarithm in a fitting function. 😭
No just add a line to convert x to a numpy array and the operation -b * x would work
x = np.array(x)
Yeah, this does not leak the data. Nothing about the generated features is affected by the feature values of your test data
Or np.asarray seems more appropriate here i guess
As long as you did not decide to use this on the basis of the features of the test data, it's fine
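One way to convince yourself, sketched with a hand-rolled degree-2 expansion. `poly2` is a hypothetical stand-in written for this example, not scikit-learn's `PolynomialFeatures`. The expansion works row by row, so doing it before or after the split gives the same rows, whereas standardization depends on column statistics and does not.

```python
import numpy as np

def poly2(X):
    """Hypothetical stand-in for degree-2 polynomial features."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    cols = [np.ones(len(X))]                          # bias column
    cols += [X[:, i] for i in range(n_features)]      # linear terms
    cols += [X[:, i] * X[:, j]                        # squares and cross terms
             for i in range(n_features) for j in range(i, n_features)]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
train, test = X[:6], X[6:]

# Row-wise transform: the test rows come out identical either way.
same_poly = np.allclose(poly2(X)[6:], poly2(test))

# Standardization uses means and stds, so including test rows changes the result.
z_all = (X - X.mean(axis=0)) / X.std(axis=0)
z_train_stats = (test - train.mean(axis=0)) / train.std(axis=0)
same_scaling = np.allclose(z_all[6:], z_train_stats)
```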
Ah, that's it! That's gotten me closer!
Ugh, my new fit function isn't very good. I need to go back to using math.log somehow
How back propagate through convolutional layer?
PyTorch doesn’t calculate gradients directly so I have nothing to test against
numpy's log functions don't allow you to specify the base as far as I can tell? like math.log does. But if I've converted x to a numpy array, it doesn't seem like I can use math.log on that data
You can use a list comprehension like [math.log(i) for i in x] and convert the result to a numpy array if you need but there's probably a better way to do that
Oh yeah logn of emath works
Btw log(base)(x) is just log(x) / log(base)
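The change-of-base formula works directly on arrays. A minimal sketch, where `log_base` is just an illustrative name:

```python
import numpy as np

def log_base(x, base):
    """Logarithm of x in an arbitrary base, via log(x) / log(base)."""
    return np.log(x) / np.log(base)

result = log_base(np.array([1.0, 2.0, 8.0]), 2.0)
```

This blows up when `base` is 1, since `log(1)` is 0.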
I think I had that on a test yeaars ago 🙂
Haha, well it seems you have never used it outside that test.
hmm.... I think it's failing now because of a divide by 0. Is there a way I can make sure the variable in the denominator isn't allowed to be 0 when doing curve_fit?
Where does the division happen ?
In my model function
def bestGuess(x, a, b, c):
    x = np.asarray(x)
    comp = a * np.emath.logn(x, 1/b) + c
    return comp
hmmm... perhaps using np.divide instead?
You mean 1 / b ?
yeah
Well i dont think you're passing a base 0 to your function, so no zero division error here
The problem might be somewhere else
No, that's where it is. I'm not passing a 0 directly. but curve_fit tries a bunch of variables in a b and c to fit the curve.
At least I think so
RuntimeWarning: divide by zero encountered in divide
return nx.log(x)/nx.log(n)
so maybe it's in emath.logn
Hmm, i dont know how curve_fit works but if the problem is as you said then you can specify bounds for the parameters to be tried
Refer to the documentation of the function, bounds parameter, might be the solution
Hmm if emath.logn returns this, then the problem is in nx.log(n) which happens to be 0 when you pass 1 to it, so try excluding 1 from base bounds maybe
I think you're very likely correct about bounds. But that's just giving me a new warning, regardless of what I set bounds to. Curve fit is calling least_squares, and that's giving me:
ValueError: Residuals are not finite in the initial point.
And that's after trying a number of different bounds. Right now I'm using:
popt, pcov = curve_fit(bestGuess, Msol, M_v, bounds=([np.NINF, 2,np.NINF], np.PINF))
But even if I just use big and small ints, I get the same ValueError
like bounds=([999,2,999], 999)
Hmm, then the problem is in the first point of your input and the value your function returns
Like if it's negative or something, or very close to 0, then you'll get infinity because there's no log(0)
Try inspecting it, this might be the problem
Yeah, that does seem possible. What do you mean by inspecting?
See the output of your function for the first value
Ah I see. Printing the comp value does have a -Inf buried in it. Hmmm
Ah, because there is a 1 in that same index of Msol
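Something that may be worth checking here: NumPy documents the signature as `np.emath.logn(n, x)`, with the base first. So `logn(x, 1/b)` takes the log of `1/b` in base `x`, and any 1 in the data ends up in the denominator as `log(1) = 0`. A sketch with made-up values:

```python
import numpy as np

x = np.array([0.5, 1.0, 2.0, 4.0])  # made-up data containing a 1
b = 0.5
base = 1 / b                        # base 2

# Base first, then the values: log base 2 of x.
as_intended = np.emath.logn(base, x)

# Arguments swapped: log base x of 2, which divides by log(1) = 0.
with np.errstate(divide="ignore"):
    swapped = np.emath.logn(x, base)
```

With the base first, a 1 in the data just gives a log of 0 instead of an infinity.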
why do loss function's mathematical form always have y binary variable but we dont have y in unsupervised learning, But i dont see their unsupervised way mentioned
that's because in unsupervised learning you don't have a reference y
you have some function of the input that is then inverted
the target is not norm(y - f(x))
i am confused because i was asked by my prof. to apply contrastive loss to a tensor of shape (bs*tokens, embedding size), and he told me that after the contrastive loss i should have (bs*tokens) values
it's norm(x - g(f(x)))
i dont see how, what he said is possible
I guess the easiest solution is just deleting that point. However its a shame as its the single most certain point.
idk what contrastive loss is
ok, so a quick google search says its an example of what i said
you give an input x, and you want to produce another x that is close to it
hmm, actually its a little deep, google search probably wont be helpful here.
My understanding is it takes triplet, BUT triplet/ranking/ etc seems same but are diff.
I dont know still how he wanted me to make pairs
the wording in this is a little confusing. wdym by "after the loss"? the loss functions most commonly employed are scalar-valued. you can average over several examples, sure
if you can word the problem a bit more generally, i can give you a hand. i'm somewhat familiar with optimization tasks. i don't know much specifically about this task you're talking about, but off the top of my head it seems similar to what nonlinear component analysis does: try to learn a metric
in hard mining and other scenarios, you have the option to do mean/sum/max at vector of values
so values means just a step before scaler was calculated
this is very different from what you said your prof told you
this is BEFORE computing the loss
i can explain, one sec.
So, lets talk about Noise Contrastive Estimation first, it is very similar to something like UNSUPERVISED cross entropy loss with temp. what we do is following:
you need
anchor (bs, embed size)
positive (bs, embed size)
negative (bs, embed size) (optional, can also use positives from other pairs)
then do
res = anchor @ positive.t()
Now along diagonal you have similarity score from correct pair, other positions have incorrect pair
if you do vec = res.diag() you have a vector. Now you can do hard mining by doing max(vec)
so here "values" means the scalars along the diagonal.
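A rough NumPy sketch of how one loss value per row could come out of this, assuming an InfoNCE-style loss with in-batch negatives. The function name, the temperature, and the L2 normalisation are assumptions for illustration, not necessarily what the prof has in mind.

```python
import numpy as np

def info_nce_per_sample(anchor, positive, temperature=0.1):
    """One loss value per row: shape (bs, embed) in, shape (bs,) out."""
    # L2-normalise so the dot products are cosine similarities.
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature                # (bs, bs) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Diagonal entries are the correct pairs; off-diagonals act as negatives.
    return -np.diag(log_prob)

rng = np.random.default_rng(0)
bs, embed = 4, 8
anchor = rng.normal(size=(bs, embed))
positive = anchor + 0.01 * rng.normal(size=(bs, embed))  # near-duplicates
losses = info_nce_per_sample(anchor, positive)
```

Taking the mean, sum, or max of `losses` is then the reduction step that turns the vector into a scalar.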
@wooden sail i hope this connect dots of contrastive, values, etc
this info does not help at all 😛
what do anchor, positive, negative, res, bs, and embed size mean?
we can talk about unsupervised learning completely separately from your specific application and cost function, as it's a framework
can assume NLP task, anchor is just test, positive would be similar text, bs is BATCH size, embed size is size of representation of token
ok, sure, you're taking an inner product and applying cauchy-schwarz
i forgot what that was, i read it 2 years back
it's the only thing you needed to explain the scenario 😛
I'm still pretty off target, as Msol gets larger. But this kinda feels like magic
This looks a lot like the function 1/x but shifted
you can use a change of base with logs
I thought so too. But adding a 1/ in there didn't do much for me last I tried, I think because a scales it
i think i will stay confused until prof explain, lmao
Yeah, @untold cliff got me hooked up with np.emath.logn() which is what I needed. And showed me the old base change formula too.
all right
also fyi, your problem is not convex. it's littered with local optima and your result will depend on your initial guess of the parameters
Yeah, I'm doing my best to narrow in that initial guess. What do you mean by local optima?
places where the gradient becomes zero and the hessian is positive definite, but are not the true solution
Ah, I see. It's all the data I have to work with though.
You mean my model function?
yep
Not sure how to find one that is more likely to work?
comp = a * b/x + c yields this:
it's just a lesson that optimization is difficult. even if you know perfectly what the model is, it may be impossible to find the parameters
Hmmm... maybe I can use the covariance matrix it produces to to get better results?
Any other suggested next steps? It feels like
comp = a * b ** x + c
Is pretty close as far as the shape is concerned
i don't really have any recommendations. how about a/x + c
That's this one: #data-science-and-ml message
Did you try setting the bounds for a and c ?
Hmm... actually, I think restricting the bounds of a, b and c might...
lol. yeah, trying to be more clever about that now.
Try some online plotting tool, play with the values a little to get a feeling for the bounds
So doing it like this is fine?
`poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X_poly, y, test_size=0.25, random_state=None)
lr = LinearRegression(positive=False)
lr.fit(X_train, y_train)
Yhat = lr.predict(X_test)`
Btw, you only need a / x + c, no need for b as it would be multiplied by a and thus just be another coefficient
adding an extra parameter that cannot be separated makes the estimation task more difficult
try with just parameters a and c
nice. which model?
comp = a / x**b + c
[ 19.12270031 0.2224922 -14.09644528]
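For reference, a sketch of fitting that model shape with an initial guess and bounds. The data here is synthetic and the true parameters, `p0`, and bounds are made up for illustration; they are not the Msol / M_v values from the chat.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_model(x, a, b, c):
    """a / x**b + c, accepting lists or arrays for x."""
    x = np.asarray(x, dtype=float)
    return a / x**b + c

# Synthetic, noise-free data from known (made-up) parameters.
x = np.linspace(0.5, 5.0, 50)
y = power_model(x, 2.0, 1.5, 0.5)

popt, pcov = curve_fit(
    power_model, x, y,
    p0=[1.0, 1.0, 0.0],                                      # initial guess
    bounds=([-np.inf, 0.0, -np.inf], [np.inf, 5.0, np.inf]), # keep b in [0, 5]
)
max_residual = np.max(np.abs(power_model(x, *popt) - y))
```

Since the problem is not convex, a different `p0` can land on a different answer, so it is worth trying a few starting points on real data.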
interesting
Congratulations 👏🎉
Thank you @wooden sail and @untold cliff for all your help!
hey,
could someone look at this tutorial and explain to me why 128 exactly ?? does it matter if i changed it ? https://youtu.be/bemDFpNooA8?list=PLQY2H8rRoyvwWuPiWnuTDBHe7I0fMSsfO&t=198
im still trying to wrap my head around AI
It's a bit arbitrary, some values will work better than others. @fiery jungle
It just means that layer has 128 nodes
ah ok , that means i could try something like 64 or 256 and get a better results , right ? there is no real logic behind it or a certain principle to follow or calculate, correct ?
There is a bit of logic behind it. The value can determine a lot of behavior of your model, and there is a bit of predictability. Often people try a few values, see how well the model performs, and then stick with it.
Lower values will generally give less of a chance to overfit than higher values, but higher values are more likely to model a more complex function.
holly cow look what chatgpt said when i feed him the same question !!! did he just watch a youtube video ??
thank you so much ❤️ ❤️
I'm watching a tutorial on a game that has nothing to do with skateboarding..
Are you using gpt4?
yes !
use it on the browser, some apps claims that they have chat gpt4 but they actually have a crappy old version of it.
No this is just gpt3
yes with better model
i literally just copied and pasted the same message i sent here as shown in the picture
i wonder if it analyzed the video or just read context of the subtitles
try asking it what is the title of the video, then try again with a more recent video from after the training data cutoff
hmm ok i will ask for a recently released video
he got it right !!
that was released 19h ago
are we in the future ? 😄
huh, pretty neat and a little surprising, but to be fair considering the features the BingGPT and GPT4 demos promised it does seems reasonable
what is concerning is , if this massive futuristic genius is what they have released to the public, .... what are they hiding up their sleeves?
it has been brought to my attention that 40 out of 43 (i guess 41 out of 44 now) conversations involving toeplitz matrices in this server have involved me
Edd how convolution backpropagation
ah that's pretty rough
coincidentally, the easiest way would be to either pass through a large toeplitz matrix and rewrite that as a sum, or directly rewrite the convolution as a sum, then differentiate that
it builds a computational graph and uses that to only differentiate simple functions
in this case, since convolution is linear, it's addition and multiplication composed with whatever activation function you use
how get same results as pytorch but by using actual derivatives
pytorch uses actual derivatives
but you can always do this
easiest isnt best
the best is to build your own computational graph, if you want to be able to arbitrarily nest your functions
what is that
ok i dont want to do that
i thought conv backprop involved reversing it and then doing a convolution with a padded matrix
you can't avoid it. from this point on, it's ALL math. your only choice is how to do the math
idk, maybe that's the case for a single layer. the only way to show that is by doing what i told you
write it explicitly as a sum, differentiate that, and see what the resulting structure is
i'm not going to do it for you
i need like a step by step
https://towardsdatascience.com/backpropagation-in-a-convolutional-layer-24c8d64d8509 maybe check this out? i usually don't like their blogs due to poor quality control, but take a look
it's really a lot easier to notice that convolution is multiplication by a toeplitz matrix
this immediately tells you how to write the derivatives
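A minimal 1-D sketch of "write it as a sum, then differentiate", assuming a single channel, stride 1, no padding, and the cross-correlation convention that PyTorch's conv layers use (no kernel flip). All function names are made up for this example.

```python
import numpy as np

def conv1d_valid(x, w):
    """Forward pass: y[i] = sum_k w[k] * x[i + k]."""
    n, m = len(x), len(w)
    return np.array([np.dot(w, x[i:i + m]) for i in range(n - m + 1)])

def conv1d_backward(x, w, grad_y):
    """Differentiate the sum above with respect to x and w."""
    n, m = len(x), len(w)
    n_out = n - m + 1
    # dL/dw[k] = sum_i grad_y[i] * x[i + k]
    grad_w = np.array([np.dot(grad_y, x[k:k + n_out]) for k in range(m)])
    # dL/dx[j] = sum_i grad_y[i] * w[j - i]
    grad_x = np.zeros(n)
    for i in range(n_out):
        grad_x[i:i + m] += grad_y[i] * w
    return grad_x, grad_w

def toeplitz_matrix(w, n):
    """Matrix T such that T @ x equals conv1d_valid(x, w)."""
    m = len(w)
    T = np.zeros((n - m + 1, n))
    for i in range(n - m + 1):
        T[i, i:i + m] = w
    return T

rng = np.random.default_rng(0)
x, w = rng.normal(size=7), rng.normal(size=3)
y = conv1d_valid(x, w)
grad_y = y.copy()                 # gradient of L = 0.5 * sum(y**2)
grad_x, grad_w = conv1d_backward(x, w, grad_y)
T = toeplitz_matrix(w, len(x))    # then grad_x is just T.T @ grad_y
```

In this setup `grad_x` also matches `np.convolve(grad_y, w)`, which is the "convolve with a padded, reversed version" picture mentioned above. Comparing against finite differences is a way to test a hand-written backward pass.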
my friend say Toeplitz is a made up word
you lost my attention
so if i make a computational graph
it will tell me if my backpropagation is working right
ima do that
can someone explain to me from a software engineering + deep learning perspective what is going on in these lines of code?
for inputs, targets in test_dataloader:
    outputs = model(inputs)  # output of the model
    _, predictions = torch.max(outputs, 1)
what type of tensor should outputs be returning? what's the relevance of the _ ? And I understand that predictions returns an array of the correct classifications, but how is that working from a coding perspective?
"_" means torch.max produces 2 values but only the second one is of importance
okay, but what data is in _
nothing you need. it holds the max values themselves, which just get ignored
its the same as putting a variable name there and never using the variable
for inputs, targets in test_dataloader:: This line initiates a for loop that iterates over the batches of data in the test dataset. The test_dataloader object is likely an instance of the PyTorch DataLoader class, which provides an iterable interface to a dataset by batching and shuffling the data.
outputs = model(inputs): This line passes the input batch inputs through the trained machine learning model model to obtain the predicted outputs. For a classifier, the outputs tensor typically has shape (batch_size, num_classes): one score per class for each sample, unlike targets, which usually holds one label per sample.
_, predictions = torch.max(outputs, 1): This line uses the PyTorch torch.max() function to find the maximum value and corresponding index along the second dimension (i.e., the dimension corresponding to the number of classes) of the outputs tensor. The maximum value is discarded (indicated by the _ variable) and the predicted class labels are stored in the predictions tensor. The predictions tensor is likely a one-dimensional tensor with the same number of elements as the inputs and targets tensors.
so predictions != labels yeah? it's just the associated index for each of the maximum tensors? and the max tensors are the tensors returned from the outputs after they've been run in the model, so it's the most probable classification index yeah?
yes
The need to standardize X isn't accompanied by the need to standardize y, right?
Well is there ever a need to standardize y?
is this in response to something else?
Nah, it's just a question
you need to be consistent in how you represent everything in your ML code
I get that you should standardize independent variables where needed, for KNN for example, but should you standardize the dependent variable as well?
what do you mean by "standardize", in this context?
and what type of data are the independent and dependent variables?
x-mean/std
Idk, numbers? I don't have a particular example, I'm just thinking theoretically. Like, should I ever standardize the dependent variable
if your X data are numbers, one normalization technique is to rescale all the numbers so the maximum value is 1 and the minimum value is 0. and if your y data are also numbers, you could do the same thing (though the scale will be different). but you can't do that if your y data are categories, for example.
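Both kinds of rescaling mentioned here fit in a few lines of NumPy. A minimal sketch with made-up numbers (the function names are illustrative):

```python
import numpy as np

def min_max_scale(v):
    """Rescale so the minimum maps to 0 and the maximum maps to 1."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

def standardize(v):
    """(x - mean) / std: zero mean and unit standard deviation."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()

y = np.array([10.0, 20.0, 30.0, 40.0])
y_minmax = min_max_scale(y)
y_standard = standardize(y)
```

Either way, predictions made on a rescaled y have to be mapped back to the original units before they are reported.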
Are there any radius detection algorithms used for 3D images in python?
I get that. But what I don't get is, why would I want/need to normalize the y data in the first place? What will that achieve? Won't the results be the same as if I didn't do anything with the y data?
the same reasons as normalizing the x data.
But.. isn't the point of normalizing x data in preventing one feature from dominating the other? Whereas there's only 1 y variable, so.. there's nothing to dominate over, so it shouldn't matter?
can you guys recommend books for maths especially in machine learning?
Mathematics for machine learning
hey, my predictions from torch.max() is returning [3, 1, 2, 2, 1, 0, 1, 1, 0, 0, 1, 3, 0, 0, 0, 3, 3, 1, 1, 0, 1, 3, 1, 1, 2, 3, 1, 0, 1, 3, 2, 2]). My model is supposed to be a multilabel classification. I'm not sure how to return a label
looks like this is the second of the two return values we talked about last week. it represents the index of the maximum value, for each row
so it only gives you one per row. you have to know what class each number represents, perhaps by keeping it in a dictionary.
when you say "multilabel", you mean that each instance can have more than one label/class, yes?
I have 4 classes of eye diseases (indexes 0 -> 3). It should return only 1, but there are 4 possible labels
so that's what [3, 1, 2, 2, 1, 0, 1, 1, 0, 0, 1, 3, 0, 0, 0, 3, 3, 1, 1, 0, 1, 3, 1, 1, 2, 3, 1, 0, 1, 3, 2, 2] is
why are there so many values?
outputs is a tensor of whatever shape the model outputs (that's defined in the model itself). The _ signifies that you're not using the values of the outputs, just the indices (torch.max returns values and indices).
so predictions will be a tensor of indices corresponding to the maximum value of each row (since you specified dim as 1, which is the second argument to torch.max)
if you specified dim as 0, it would be the maximum value of each column
and so on
The reason you do that is because the model returns a batch of outputs, which is a tensor of size (batch_size, num_classes), so getting the maximum value of each row will give you the prediction for each sample in the batch
there's multiple labels since you have a batch size more than 1
so each label there corresponds to the input batch's samples
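To make the shapes concrete, here is a NumPy stand-in for the same operation, so it runs without PyTorch. The scores and class names are made up; `outputs.max(axis=1)` and `outputs.argmax(axis=1)` play the roles of the two values returned by `torch.max(outputs, 1)`.

```python
import numpy as np

# Made-up scores for a batch of 3 samples and 4 classes.
outputs = np.array([
    [0.1, 2.0, -1.0, 0.3],
    [1.5, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.2, 3.0],
])

values = outputs.max(axis=1)          # the part that "_" throws away
predictions = outputs.argmax(axis=1)  # index of the best class in each row

# Hypothetical names; the real ones depend on how the dataset was encoded.
class_names = ["class_0", "class_1", "class_2", "class_3"]
labels = [class_names[i] for i in predictions]
```

So a batch of 32 images gives 32 indices, one per image, and the lookup turns each index into a readable label.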
Hello! So for my particular project, I would need Anaconda version 3.9 however only Anaconda 3.10 is available. Is there a place where I can get version 3.9?
when you create a conda environment you can set it to use python 3.9
it would just be conda create -n env_name python=3.9
Gotcha, thanks!
the anaconda version only determines the python version of the base environment
guys, i wanna try code a bot that reppetetively trys claim an xbox gamertag until its claimed, dm me please.
that sounds like it would be a breach of the xbox ToS
which would break rule 5
!rule 5
5. Do not provide or request help on projects that may break laws, breach terms of services, or are malicious or inappropriate.
🤯
how could one plot an envelope like this?
the hilbert function from scipy only plots one for the maxima
ooooo wavy
I think you could take the negative of your function as it would flip it with respect to the x-axis, and then apply this Hilbert function and take its negative as well
that's a good shout
will try that!
!e
import numpy as np
from scipy.signal import hilbert
import matplotlib.pyplot as plt
fs = 1000
Nt = 50
t = np.arange(Nt)/fs
carrier = np.cos(2*np.pi*100*t)
envelope = np.cos(2*np.pi*10*t + 120*np.pi/180) +\
2*np.cos(np.pi*5*t + 31*np.pi/180) +\
5*np.cos(np.pi*17*t + 47*np.pi/180)
signal = carrier*envelope
estimated_envelope = np.abs(hilbert(signal))
plt.plot(t, signal)
plt.plot(t, estimated_envelope)
plt.plot(t, -estimated_envelope)
plt.legend(("signal","positive envelope","negative envelope"))
plt.savefig("BIG_OOF.png")
geez
@wooden sail :warning: Your 3.11 eval job timed out or ran out of memory.
[No output]
anybody know any good pretrained text classifiers?
beautiful
That's nice. Is this a property of trig functions ?
which one?
Symmetry of the envelope
no
well. it's a property of modulated carriers, and carriers are trig functions. so in that sense, sure.
but usually speaking of envelopes already implies you're working with modulated carriers
if you have a trigonometric func as a carrier, call it c(t), and an envelope e(t), then the signal e(t)c(t) has a symmetric envelope, since e(t) becomes the amplitude of the sinusoid in the carrier
Aah i see. Thanks!
made it a little better-looking
hey,
i wonder if AI requires lots of permanent memory to operate, like if the dataset is on a cloud , does it still requires a big hard drive to operate?
and how about memory? ... could a regular computer or an old device host something like chatgpt since all its dataSets are hosted online ?
I'm not aware of any. I can imagine a situation where this could be construed as reasonable—if you know that the missing part of the data is independent of the rest, and if that missing part is pretty tight around its mean (median, etc.). But in that case, why impute data? So I don't know of any situation where the procedure you describe is actually a good idea. (Besides which, my general advice for imputation is to avoid it when possible. It's very easy to make mistakes that affect your analysis.)
In the most general setting, this is really a property of things that you can take the Hilbert transform of. Suppose you have a function f(z) which is holomorphic in a neighborhood of the closure of the upper half plane. For a real number x, define g(x) = Re f(x). Assume g is continuously differentiable. Also assume that f satisfies a decay condition as |z| -> infty, e.g., f(z) = O(|z|^{-1}). Then, for x on the real axis, (Im f)(x) is the Hilbert transform of g(x). By combining g and its Hilbert transform, you can compute f on the real axis (general properties of holomorphic functions ensure there's a unique extension to a neighborhood of the closure of the upper half plane but say nothing about how to compute it). The reason why this gives you something that looks like an envelope is because cos x is the real part of e^{ix}, which has constant absolute value. That is, the Hilbert transform tells you how to fill in sinusoidal wiggles!
Could you give examples of these mistakes? Like including the target variable in the features used for imputation maybe?
The biggest problem is that whether or not you measured a data value may have something to do with that value.
For instance, I gave a time series example earlier where the sensor stopped working when the reading was too high.
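A small simulation of that sensor scenario, with made-up numbers (the distribution, the cutoff, and the seed are all assumptions). Readings above the limit come back as missing, so anything estimated from the observed values alone is biased low.

```python
import numpy as np

rng = np.random.default_rng(1)
true_readings = rng.normal(loc=50.0, scale=10.0, size=10_000)

limit = 60.0  # hypothetical cutoff above which the sensor reports "error"
observed = np.where(true_readings > limit, np.nan, true_readings)

# Mean imputation fills the gaps with the mean of what was observed.
observed_mean = np.nanmean(observed)
imputed = np.where(np.isnan(observed), observed_mean, observed)

# Negative: the imputed series underestimates the true average.
bias = imputed.mean() - true_readings.mean()
```

The missing values are exactly the high ones, so no statistic of the observed data can stand in for them without some model of why they are missing.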
realised I didn't even need to use the funky Hilbert function or anything else
as my function is bounded by a cos term 😐
so the max and min are just +/- of the other part of the function
Another case turns up when doing surveys. For example, imagine that you're doing political polls. You dial a number. They answer, but when you get halfway through the poll and ask them about a really sensitive topic, they hang up. Why? Ideally for the pollster, people hang up totally randomly, for reasons completely unconnected to their opinions. In reality, that's not true. Some people are more likely to answer polls than others. So when you have a missing data point—a person who hung up midway through—it's quite likely that imputing missing responses based on other people's responses will be wrong.
I'm using pyomo with the ipopt solver. I want to define a derived variable using an if/elif/else clause, or something similar that could produce the same results. If it didn't have to be compatible with the solver, I'd just use this python function:
def star_lum(mass):
    """
    Calculates the luminosity of a star with the given mass.
    """
    # Define the Mass-Luminosity Relation
    if mass < 0.43:
        luminosity = 0.23 * mass ** 2.3
    elif mass < 2:
        luminosity = mass ** 4
    elif mass < 20:
        luminosity = 1.5 * mass ** 3.5
    else:
        luminosity = 32000 * mass
    return luminosity
But I can't include function calls in derived variable definitions.
I've looked into piecewise() but I don't think it'll actually work, (unless I'm misunderstanding it or doing it wrong.)
Guys, I'm beginning to write some research papers on AI and Deep Learning, and I wanted to know...can someone recommend me an app or another way to make sketches and schemes to illustrate concepts?
Making those at Paint 3D feels a bit too amateur...even for an amateur 
Ah yeah this is the missing not at random case right? Where we have to figure out the reason for the missingness in order to see if we can do imputation?
Make a plot with something like GNU Plot, export to SVG, and import into Inkscape.
Ugh...seems a bit complicated, but thanks!
At least Inkscape seems better than Paint 3D
In a Neural Network why do we use multiple nodes? if the nodes have the same activation function isn't it useless to connect every input to every node because we would get the same value for that input from every node? is each node calculating something different?
Same activation function doesn't mean same output
If your first node is applying the operation 1x5, and your activation function is a ReLU, its output will be different from your second node that is applying the operation 5x-1 with ReLU function.
ReLU(1x5) = 5
ReLU(5 x -1) = 0
Howdy, y'all. I work primarily in the ds/ml/mle space right now and have Python as a daily driver. I've got some free time comin' up and was checking out some things I could learn. One of them is GoLang. So, here's a very soft question related to that.
Q: Does anyone have experience working with Go and Python together in the ds/ml/mle space? Are there common ds or data engineering problems that you feel Go is particularly good at either by itself or coupled with Python?
The Rectified Linear function is defined as:
f(x) = 0 if x <= 0 ; else f(x) = x
And the first node would execute the operation 1x5, while the second one, 5 x -1
So, you'd have:
1 x 5 (node 1) = 5 ----> ReLU(5) = 5 ---> 5 x -1 (node 2) ---> ReLU(-5) = 0
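The same example as a tiny sketch, assuming a single input of 5 and two nodes whose weights are 1 and -1. The function names are made up for illustration.

```python
def relu(x):
    """Rectified linear unit: 0 for x <= 0, x otherwise."""
    return max(0.0, x)

def node(inputs, weights, bias=0.0):
    """One node: weighted sum of the inputs, then the activation."""
    return relu(sum(w * x for w, x in zip(weights, inputs)) + bias)

inputs = [5.0]
out1 = node(inputs, weights=[1.0])   # ReLU(1 * 5)
out2 = node(inputs, weights=[-1.0])  # ReLU(-1 * 5)
print(out1, out2)  # 5.0 0.0
```

Both nodes share the activation function, but their weights differ, so their outputs differ.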
i had a question
this is a graph of training loss vs epochs, as well as validation accuracy vs epochs
why does loss fluctuate so much
even though it gets lower over time and accuracy increases over time
It's difficult to tell why this would be the case just from looking at the graphs. The most common thing in my (limited) experience was a batch size which is too small or not representative of the population. tl;dr, maybe try increasing batch_size and see what happens.
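A rough way to see the batch-size effect without any model: treat the per-sample losses as fixed numbers and look at how much the average over a random batch jumps around. The exponential losses and the function name below are assumptions for illustration only.

```python
import random
import statistics

random.seed(0)
# Hypothetical per-sample losses at some fixed point in training.
per_sample_loss = [random.expovariate(1.0) for _ in range(10_000)]

def batch_loss_std(batch_size, n_batches=200):
    """Standard deviation of the mean loss across randomly drawn batches."""
    means = [
        statistics.fmean(random.sample(per_sample_loss, batch_size))
        for _ in range(n_batches)
    ]
    return statistics.stdev(means)

small = batch_loss_std(8)
large = batch_loss_std(256)
print(f"std of batch loss, batch_size=8:   {small:.3f}")
print(f"std of batch loss, batch_size=256: {large:.3f}")
```

The spread shrinks roughly like one over the square root of the batch size, which is one reason a larger batch_size tends to give a smoother training-loss curve.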
ill try doing that
i think i get it
i had a 100000 data set
i made a mini sample
of 1,000 images
and ran the model on it
resulting in greater repetitiveness and therefore more fluctuation in training loss
if i trained the model on the cloud using the entire dset i think i would have less volatility
im not sure if that's the right thought process
that's just what i think is the case
I have not worked with NNs enough to know the ins-and-outs, and I do not usually work with image data. It's possible that the batches were all just very different from each other or something, I'm unsure.
it's specifically a property of sinusoids though. you can make a signal that does not have zero mean and you'll notice if it is entirely above the x axis, the envelope is the signal itself. then it's no longer symmetric. it has the nice symmetry property particularly due to the fourier modulating property, if you wanna think about it that way. as you wrote as well, it's due to the exponential having modulus 1 everywhere
Guys, when it's preferable to use KL-Divergence rather than MSE Loss?
I don't really get the difference between them. They both look like loss functions that penalizes outputs too different from the target
I'm considering making a model to convert mel-spectrograms into waveforms. For that, I suppose I'd have to use one of those losses...
Uh...now that I think about it...I may have to use none of them at all, but actually something like gaussian log likelihood 
Still, the question remains. When should I use KLD, when MSE?
all cost functions do this
KLD measures the difference between two distributions. MSE measures the distance between estimates/data points (that were drawn from some distribution)
under special conditions, the two things are the same. in general, they aren't
Oooh, between distributions
So MSE would be more like element-wise operations? Like...trying to recompose an image, for example.
And KLD for probability distributions?
I see
Now it makes more sense
as i said, these are sometimes the same thing. it depends on how the probability of observing a specific sample depends on that sample
But then...gaussian likelihood loss is for probability distributions too, isn't it?
A probability distribution?
so, a gaussian pdf with fixed covariance is precisely a case where the two things are the same 😛
the distance between two gaussian distributions boils down to something proportional to the distance between their means
so the KLD and MSE both yield something that looks like least squares
Ok, so... the distance between two gaussian distributions would be...gaussian likelihood loss.
and it could also be measured by KLD.
If I sample a point from those two gaussian distributions...MSE Loss?

gaussian likelihood loss is a maximum likelihood estimation, it's not the distance between two gaussian distributions
it just happens to look identical

gaussian likelihood looks at the posterior probability of the parameters of one distribution, given observations of samples drawn from that same distribution
KLD needs samples from 2 distributions
the final expression looks identical, but the interpretation is different
so that maximizing the likelihood given some data is the same as minimizing the KLD between two distributions
(only for this special case)
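A quick numerical check of the special case, assuming two one-dimensional Gaussians that share the same standard deviation. For that case the closed form is KL = (mu_p - mu_q)^2 / (2 sigma^2), which is the squared error between the means up to a constant factor. The function names and the numbers are made up for illustration.

```python
import math

def normal_logpdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def kl_numeric(mu_p, mu_q, sigma, lo=-20.0, hi=20.0, steps=40_000):
    """KL(p || q) by midpoint-rule integration of p(x) * (log p(x) - log q(x))."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        log_p = normal_logpdf(x, mu_p, sigma)
        log_q = normal_logpdf(x, mu_q, sigma)
        total += math.exp(log_p) * (log_p - log_q) * dx
    return total

mu_p, mu_q, sigma = 0.0, 1.5, 1.0
kl_integrated = kl_numeric(mu_p, mu_q, sigma)
kl_closed_form = (mu_p - mu_q) ** 2 / (2 * sigma ** 2)
squared_error = (mu_p - mu_q) ** 2

print(kl_integrated, kl_closed_form, squared_error)
```

With a different or learned variance the two quantities stop being proportional, which is why this only holds in the special case.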
So...if I want my model to receive a mel-spectrogram as input, and generate a waveform with the most likely data values...which loss should I use?
I'm thinking about a Variational AutoEncoder, where the Decoder generates an image based on the most likely value for each pixel.
But for the Decoder, it's used the gaussian likelihood loss.
Oooh, I think I'm getting it now.
I should probably make the model generate a probability distribution for each data point, sample from this distribution, and then apply a negative log likelihood, since the point sampled should be one with highest probability, right?
And to calibrate how my model will generate this probability distribution, I should probably generate a probability distribution with the waveform of the original audio data, and apply KLD to compare the two distributions...I guess

Do you usually check for the p-value when checking for correlation ?
indepth tutorial of MIL? all i find is high level idea, or research paper
Hey Hey people Seth Here
I am trying some text binary classification with LSTM/GRU + GloVe
And I am wondering the following :
- Should I use dropout? If so how much (Read that it generally dampens the memory of LSTM)
- Probably underfitting since testing accuracy is either dropping or never rising, while training is almost always too high (e.g. 80% tr- 60% test). Is there a way to know if it is solely due to small dataset size ? (1700 sentences around 300-400 words)
Better training performance and worse test performance often suggests that you are overfitting instead of underfitting.
And this could indeed (partially) be caused by a small dataset
But also a too complex model
Thanks for answering !
Do you have any idea about the dropout ?
I havent found a consensus on that yet for lstm
Some suggest using it, some not in lstm specifically, so im going with trial and error for the time being
I guess text classification is more prone to overfitting when starting ...? 
I haven't really used lstm's specifically, but it seems that some say that it does make the model forget things that might be important, thus reducing performance. I would honestly just try it out, and see what the effect is.
Can dropout not be added in a different part of the model?
yeah thats what ive read so far
and ive been experimenting with and without to check the acc,precision,recall
But honestly I dont think I should trust results on small datasets
Isnt it only useful before making the predictions(for the train part) ?
That's right. And unless you have a pretty clear picture of why your data is missing, it's possible that there's some non-random reason why it went missing.
Wait so every node would be different? But who decides what function to put in the nodes?
Hello everyone could you pls suggest a best use case in educational sector to solve a problem using chatgpt
No, it applies more broadly than just to sinusoids. But while the thing I described is some sort of envelope, it may not reflect our intuitive sense of what an envelope ought to be. Plus, the conditions of the theorem that I'm invoking are a little delicate; for example, if your signal has a non-zero mean, then it doesn't decay fast enough for the theorem to apply. In fact the theorem doesn't apply to pure tones. For example, the real part of f(z) = ie^{z^2} is zero on the real axis, while the imaginary part blows up; clearly you can add this f(z) to a pure tone, so there's more than one holomorphic function which has a real part on the real axis which equals the pure tone; that is, without a decay condition, the analytic signal is not unique. Any time you use a Hilbert transform to construct an analytic signal, you're either making some kind of assumption that things behave nicely, or you're saying that the only analytic signals you care about are the ones where the imaginary part is determined by the Hilbert transform (an assumption that's fine in practice).
well yeah, i was talking only about the envelope as a curve modulating a sinusoid, which is the common definition. indeed the hilbert transform is a more general relationship between the real and imaginary parts of analytic functions on the upper half complex plane. just that you don't get the nice symmetry in general
Hardest part about having a bilingual education is half the terms come to mind in english and half in my language
Which at times can be confusing af
I guess my point is that the Hilbert transform leads to a reasonable definition of an envelope in many cases of interest, and it can be used to give a definition of an envelope that applies more widely, though perhaps with some loss of intuition and application.
absolutely. i was just addressing the question they asked about the upper and lower envelope being the same, not in general about analytic representations
and regarding the non-zero mean, i was handwavy there. i meant non-zero mean on the support of a function, and zero (not just the mean) everywhere else
then you get all the nice properties of square integrability to do the analysis in fourier domain
How does data augmentation work in Pytorch where the model sees a new dataset for each epoch?
in my current workflow here is my code in general:
# my custom dataset class
class Eye(Dataset):
    ...
my_transforms = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomRotation(degrees=45),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.ToTensor(),  # default normalization to 0-1 range
    transforms.Normalize(mean=(0.5,), std=(0.5,))  # greyscale images
])
# loading training dataset, val and loaders
train_dataset = Eye(image_dir="...",
                    mask_dir="...",
                    transform=my_transforms)
train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=BS,
    shuffle=True,
    num_workers=2
)
train_losses, val_losses = [], []
for epoch in range(EPOCHS + 1):
    train_loss = train(model, train_loader, optimizer, loss_fn, device)
    val_loss = evaluate(model, val_loader, loss_fn, device)
What's the question?
The transform will be applied for each batch
No because the data is not stored. The data loader keeps loading in batches of images and then applying the transformation.
So each epoch will not have the same augmented images
oh so this is done automatically for each epoch?
just confused that the transforms argument on the dataset class is used before the epoch loop
Well it's just a function that the dataloader uses to transform each batch it loads
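To see why passing the transform before the epoch loop is fine, here is a plain-Python stand-in for a dataset. It is not torch code; the class and function names are made up. In a typical torch Dataset the transform is called inside `__getitem__`, once per sample each time that sample is fetched, so the random choices are redrawn on every pass over the data.

```python
import random

class TinyDataset:
    """Plain-Python stand-in for a torch-style Dataset (hypothetical)."""

    def __init__(self, samples, transform=None):
        self.samples = samples      # stored once, never modified
        self.transform = transform  # only stored here, not applied yet

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        # The transform runs here, every time an item is fetched.
        return self.transform(sample) if self.transform else sample

def random_flip(sample, p=0.5):
    """Return a reversed copy with probability p, else the sample itself."""
    return sample[::-1] if random.random() < p else sample

random.seed(0)
dataset = TinyDataset([[1, 2, 3]] * 4, transform=random_flip)

# Each "epoch" re-fetches every item, so each epoch sees freshly drawn flips.
epochs = [[dataset[i] for i in range(len(dataset))] for _ in range(20)]
seen = {tuple(item) for epoch in epochs for item in epoch}
print(seen)
```

The stored samples never change; only what `__getitem__` hands out varies from fetch to fetch.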
Does sklearn.model_selection.cross_val_score shuffle before splitting? I thought it did, but.. just read the documentation, and it seems it doesn't. Why not? Wouldn't it make more sense if it did by default?
how hard would be to make a chess bot trained on some user who played around 20k blitz games and make it to play like he ( some opening, as close as possible style of managing his time while game and so on)
i am a little bit affraid that 20k is not enough data and after the opening he will starts to play a random moves just
this splits and shuffles
xtrain, xtest, ytrain, ytest = train_test_split(df['feature_1'], df['feature_2'], shuffle=True, test_size=0.2)
you can look it up as well https://scikit-learn.org/0.20/modules/generated/sklearn.model_selection.train_test_split.html
Yeah, I know, but that doesn't perform cross validation.. I know how to cross validate with shuffling, that's not what I'm asking. I'm asking why cross_val_score doesn't shuffle by default.. is there a reason for that? Wouldn't it make more sense if it did shuffle?
To make the results reproducible ig
Does it not have split in here? 
cross_val_score(estimator=model, X=X, y=y, scoring='r2', cv=KFold(shuffle=True))
Documentation seems to indicate it is just false at start
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html
Or am I understanding the wrong thing again ? 
I am really curious now
Also what type of model are you going for?
Hi guys, for datascience class I have to create a KNN algorithm with good parameters so I made a simple implementation and I get on average 97.5% accuracy.
Is there a way to improve a simple implementation of KNN ?
Implementation with no libraries
Hi, just curious if anyone uses some sort of experiment tracking with their personal projects?
e.g. mlflow, dvc
Yeah, that's correct. What I'm wondering is: why is it false by default? Shuffle is true by default in train_test_split, so why isn't it true by default for cross_validation? I just want to know the reasoning behind that choice.
If I didn't notice that it's false by default, I probably would've gone using CV, assuming that it shuffles by default (because.. I think that makes more sense), which could lead to problems, for polynomial models, for example
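A small illustration of the problem being described, written in plain Python rather than with scikit-learn. The helper `kfold_indices` is made up; it mimics how an unshuffled k-fold split takes consecutive blocks of indices, which matters when the rows are sorted (here, by label).

```python
import random

def kfold_indices(n_samples, n_splits, shuffle=False, seed=None):
    """Test-fold indices for k-fold CV; assumes n_samples divisible by n_splits."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    fold_size = n_samples // n_splits
    return [indices[i * fold_size:(i + 1) * fold_size] for i in range(n_splits)]

# Hypothetical dataset sorted by label: 50 zeros followed by 50 ones.
labels = [0] * 50 + [1] * 50

def fraction_of_ones(folds):
    return [sum(labels[i] for i in fold) / len(fold) for fold in folds]

plain = kfold_indices(100, 5)
shuffled = kfold_indices(100, 5, shuffle=True, seed=0)

plain_fractions = fraction_of_ones(plain)
shuffled_fractions = fraction_of_ones(shuffled)
print("no shuffle:", plain_fractions)
print("shuffled:  ", shuffled_fractions)
```

Without shuffling, four of the five test folds contain a single class. With scikit-learn itself, the fix is the one already shown above: pass `cv=KFold(shuffle=True)`, ideally with a `random_state` so the result is reproducible.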
@queen cradle can you answer this question please? Because if we should do that then what would a good visualization of it be cause heatmaps wouldn't be appropriate i guess ?
indepth tutorial of Multiple instance learning(MIL)? all i find is high level idea, or research paper
The activation function for each node is kind of a hyperparameter, but the number in the node that will multiply the input is simply the weight of that node. And this number is defined upon initialization and optimized through training
Oh, and I just discovered that I was using a MSE Loss wrongly for my self-learning model.
The model outputs a probability distribution out of a softmax...and for measuring its consistency, I was applying MSE between 2 outputs of the same model(the model has dropout, so those outputs tend to be slightly different)
It seems KL-Divergence would be preferred, here? Or maybe cross entropy, since it's a classification task... 
I think I'll just test both
Though Pytorch's KLD seems a bit weird. I tend to get some problems with it...
i think someone here mentioned that pytorch's KLD applies softmax to the input you pass to it to guarantee it behaves like a pdf
Oh, it's pointwise

Does it change anything?
The KL-Divergence I used for my Variational AutoEncoder seems closer to what I want:
def kl_divergence(z, mu, std):
    # Monte carlo KL divergence
    # 1. define the first two probabilities (in this case Normal for both)
    p = torch.distributions.Normal(torch.zeros_like(mu), torch.ones_like(std))
    q = torch.distributions.Normal(mu, std)
    # 2. get the probabilities from the equation
    log_qzx = q.log_prob(z)
    log_pz = p.log_prob(z)
    # kl
    kl = (log_qzx - log_pz)
    kl = kl.sum(-1)
    return kl
I remember that Pytorch's version didn't work well...but that could be just me doing something wrong...
that looks like the same as pytorch does
ah you don't have the product in front though
the pointwise part is that it does not add/integrate over all observable events
Oh...
And that product...is it mandatory in the function?
Should I apply kl = log_pz * (log_qzx - log_pz)?
Indeed...but still, I remember having some problems with it
Well, I guess I'll just go for it. If it messes up, I'll try the VAE version.

Bayesian clustering, is there such a thing?
Hm... Since I'm doing a classification task, using torch.distributions.Categorical seems to make more sense than using a Normal distribution.
Problem is... Pytorch doesn't have a rsample implemented for Categorical distribution.
Should my implementation be something like:
output = model(input)
output = torch.nn.functional.softmax(output, -1)
distribution = torch.distributions.Categorical(output)
rsample = output[:, -1] * distribution.sample()
?
The model output is supposed to be a probability distribution. And since it can be derived, then I suppose multiplying the distribution sample by the output argmax would allow for optimization, right?
Though it feels a bit redundant...I make a distribution out of my output, and then sample from this distribution using the same output... 
you shouldn't need to use torch distributions categorical there
applying a softmax at the last layer of your network already turns the output into probabilities
So, the softmax already turns it into a distribution?
So...the KLD would be KLD = outputA - outputB directly?
that depends on whether the kld will apply softmax again tbh, i don't recall if it does
some logs are missing there
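For reference, a plain-Python sketch of the discrete KL divergence between two softmax outputs, with the logs in place. The logits are made-up numbers. As far as I know, `torch.nn.functional.kl_div` expects its first argument to already be log-probabilities and does not apply softmax itself, so passing raw outputs is a common source of NaN or nonsense values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * (log p_i - log q_i) for discrete distributions."""
    return sum(
        pi * (math.log(pi) - math.log(qi))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )

# Hypothetical logits from two forward passes of the same model with dropout.
logits_a = [2.0, 0.5, -1.0]
logits_b = [1.5, 0.7, -0.8]

p = softmax(logits_a)
q = softmax(logits_b)

kl_pq = kl_divergence(p, q)
kl_pp = kl_divergence(p, p)
print(kl_pq, kl_pp)
```

The logs are taken of probabilities, which are always positive after softmax. Taking the log of raw logits, which can be negative, is not defined.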
That's the thing, I want to use a custom way to apply KLD, since Pytorch seems to be returning NaN
that's a good indicator you did something wrong 😛

Oh...
dist = torch.distributions.Categorical(out)
print(out[0])
print(dist.probs[0])
tensor([0.2728, 0.0915, 0.0055, 0.0100, 0.0943, 0.1920, 0.0499, 0.0216, 0.0858,
0.0016, 0.0006, 0.1745], device='cuda:0')
tensor([0.2728, 0.0915, 0.0055, 0.0100, 0.0943, 0.1920, 0.0499, 0.0216, 0.0858,
0.0016, 0.0006, 0.1745], device='cuda:0')
So there's no magic?
Oh yes...log can't be a negative value... 
So I should apply a ReLU to my output layer...I guess...or at least a modulus
How do y'all productionize your models (at work or for personal projects)?
I've been in the habit of sticking the model in a docker container with a small API that has a "predict" endpoint. This has some limitations but works "okay" for models which aren't getting passed a ton of data and/or which aren't near-real-time.
(This is also nice because if we want to A/B test models, we can split the data beforehand and send it to different containers!)
this book: "Make Your Own Neural Network-2016", is still good?
i dont work on ML at work but this is how intend to do it
for personal projects hosted on vercel, may just include the pickle version of model and directly use it within the app
not tested this tho but its what im currently working on
That sounds pretty similar to what I'm doin' and seems pretty okay. Never used vercel but it looks like it would work fine like that!
Hey hello guys I need a help for writing the code for detecting iris and pupil using daugmans algorithm and converting that iris from polar or circular form into cartesian form..... if any ans pls dm me....
for one project we deployed it using aws lambda
their serverless service
which was fine for this use case since the cold start issue didnt matter as much
if you do this, i recommend using the cpu version of pytorch since it made a huuuge difference (aws lambda is cpu based)
not sure if youve seen this but i highly recommend the FSDL course https://fullstackdeeplearning.com/course/2022/lecture-5-deployment/
this essentially makes it a (micro?)service?
that can be used by anything that has access to it
im trying to send 3 dif approaches but its keeps saying my files are queued
🕯️
tragic

broken rip
from the lecture
@cold osprey deploying it along with vercel in the web app server is the first approach i believe
the lambda approach would be the 3rd one
they call it model-as-a-service but its essentially a microservice
they each have their pros and cons. just be careful when scaling
(if applicable)
has there been any good models for production that can tag products with related labels?
I asked people at vue.ai and they asked me $40k dollars lol
Google Vision API is terrible too ...
nice thanks, i think 1st one makes more sense for a simple app kinda use case where the model inference time is not too long/computationally expensive
2nd one more for when the predictions are to be displayed in a dashboard too hence batch processing
3rd one is kinda for more live stuff?
real time i mean
its good for recommendation use cases
where you can do intense computations like neural collaborative filtering, etc. in a batch process (since they would take too long for real time inference), run them daily or weekly or etc. then load them up
when you need them
but yeah this works for stuff like personal websites for sure
no probs
btw
i wonder if netflix's is batch or
Like so many things, the answer is "it depends". Usually, the null hypothesis of such a test is that the correlation coefficient is zero. If you reject the null hypothesis, then you've found some kind of correlation in the data. If your test was based on Pearson's 𝜌, then you found evidence that the true value of 𝜌 is non-zero, so the covariance is non-zero; if your test was based on Kendall's 𝜏, then you found evidence that the true value of 𝜏 is non-zero, so the variables have some kind of order relationship; and so on.
Rejecting the null hypothesis implies that the two variables are not independent. This may be your goal; it lets you go to your colleagues and say, "There's a relationship!" I have done this before: I was able to say, "These things you thought were probably correlated are provably correlated," and people liked that. But failing to reject the null hypothesis does not prove that the variables are independent. No test for correlation can prove independence. If you want to be confident that the rest of your analysis is sound, then you can't assume independence unless you have domain-specific knowledge that tells you the events ought to be independent. For that reason, I think hypothesis tests for correlation coefficients have limited utility. I think they're used more often than they should be.
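A standard small example of why zero correlation does not imply independence, using made-up data: y is completely determined by x, yet Pearson's correlation coefficient is zero because the relationship is symmetric rather than linear.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

xs = [i / 10 for i in range(-10, 11)]  # symmetric around zero
ys = [x * x for x in xs]               # y depends entirely on x

r = pearson_r(xs, ys)
print(f"Pearson r between x and x^2: {r:.3g}")
```

Any test built on this coefficient would fail to reject the null hypothesis here, even though the variables are as dependent as they can be.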
batch would be offline in here. netflix also uses nearline + online. this article is from 2013. im sure theyve moved towards more of a streaming first/online infrastructure since then https://netflixtechblog.com/system-architectures-for-personalization-and-recommendation-e081aa94b5d8
nice

hi ,
im trying to self learn AI, working on understanding how to make a line that eventually converges the loss curve.
i find it a lot easier to use something like this which is part of the course
I wonder if there is a software or a way to visualize our output better than the matplotlib
while matplotlib gives the final output for the whole process, the output from that course was updated after each epoch. i feel more comfortable with that kind of interface if there's one; if not, it's fine. but just let me know if you know some application/interface that can visualize the graph better
U can track the loss/accuracy etc per epoch
Not sure if that's what you're asking
Bro I’m trying to figure out networkx right now too
Can Bayesian optimization pick the least correlated parameter sets from an insample test?
Hi, I need help with building OpenCV from source. Why? I got this error while trying to do cv2.imshow("image", image) in a project that then recommended building from source.
The error:
I tried to install the required packages using sudo apt-get update && sudo apt-get install libpangoft2-1.0-0 libtiff5 but they're already in latest version.
Note: I'm using Ubuntu 22.04
Hello guys.....need help in capitalizing duplicate letters in two given strings in python. For example in a string 'computer program' we have duplicates or repeated letters 'OMPR' ....I want the output as cOMPuteR PROgRaM
I am able to grab duplicates from the given string but couldn't understand how to capitalize them in the main string
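One possible approach, assuming the goal is to uppercase every letter that occurs more than once in a single string and to ignore case and spaces while counting. The function name is made up.

```python
from collections import Counter

def capitalize_repeated(text):
    """Uppercase every letter that appears more than once in text."""
    counts = Counter(ch.lower() for ch in text if ch.isalpha())
    return "".join(ch.upper() if counts[ch.lower()] > 1 else ch for ch in text)

result = capitalize_repeated("computer program")
print(result)  # cOMPuteR PROgRaM
```

Counting first and rebuilding the string in a second pass avoids having to modify the string in place, which Python strings do not allow.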
So you're implying that they should be used when we want to prove dependance? Or it isn't worth it either?
There's probably a better approach but this should work: https://pastebin.com/iY72RVN3
Also, i dont think this is the correct channel for this question.
Sorry..... Which is the correct channel for such questions on python?
And thanks for the solution
The help channel is most suitable i believe.
does anyone know how to implement data augmentation techniques for 3D images in Pytorch?
class Eye(Dataset):
    def __init__(self, image_dir, mask_dir, transform = None):
        self.image_dir = image_dir
        self.mask_dir = mask_dir
        self.transform = transform
        self.images = sorted(os.listdir(image_dir))
        self.masks = sorted(os.listdir(mask_dir))
        self.images.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
        self.masks.sort(key=lambda f: int(''.join(filter(str.isdigit, f))))
    def __len__(self):
        if len(self.images) == len(self.masks):
            return len(self.images)
    def __getitem__(self, index):
        img_path = os.path.join(self.image_dir, self.images[index])
        mask_path = os.path.join(self.mask_dir, self.masks[index])
        image = io.imread(img_path)
        image = image.astype(np.float32)
        mask = io.imread(mask_path)
        mask = mask.astype(np.float32)
        if self.transform:
            image = self.transform(image)
            mask = self.transform(mask)
        return image, mask
my_transforms = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomRotation(degrees = 45),
    transforms.RandomHorizontalFlip(p = 0.5),
    transforms.RandomVerticalFlip(p = 0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean = (0.5, ), std = (0.5, ))
])
train_dataset_DA = Eye(image_dir = image_path, mask_dir = mask_path, transform=my_transforms)
train_loader = ...
for x, y in train_loader:
    print(y.shape)
I'm getting this error for the for loop
my dataset consists of 3D greyscale images of shape 32x256x256 (Depth x Height x Width)
I don't know how to do that but based on the error message maybe you could try passing your image as 32 images each of shape 256*256 and then you combine them back together. Not sure if this would give you the desired result though.
PIL doesn't support it, so I guess you'd have to avoid using ToPILImage somehow.
tried this but get this error message instead
it doesnt work with numpy arrays
Yeah, it won't work directly with the image transform functions then
actually.. it's not clear to me what function fails here
I think its the next_data function
Try dropping ToTensor too, perhaps.
I get the dimensions but same error
because the thing is, the three image transforms you're using claim to work fine on tensors as well
it now works for some reason when I put ToTensor() first
ah, that makes sense, if these weren't tensors already then the image transforms might not work (the docs say they are for Pillow images or torch tensors)
my_transforms = transforms.Compose([
    transforms.ToTensor(), # default normalization to 0-1 range
    transforms.RandomRotation(degrees = 45),
    transforms.RandomHorizontalFlip(p = 0.5),
    transforms.RandomVerticalFlip(p = 0.5),
    transforms.Normalize(mean = (0.5, ), std = (0.5, )) # greyscale images
])
so weird
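One thing worth checking in the Eye dataset above: `self.transform` is called separately on the image and on the mask, so as far as I can tell the random rotation and flips are drawn independently for each, and the mask may no longer line up with its image. Below is a numpy-only sketch of the usual remedy, drawing each random decision once and applying it to both arrays. It covers flips only, the function name is made up, and it is a stand-in rather than a torchvision recipe.

```python
import numpy as np

def paired_random_flip(image, mask, rng, p=0.5):
    """Flip a (D, H, W) image and its mask together along H and/or W."""
    for axis in (1, 2):
        if rng.random() < p:  # one decision, applied to both arrays
            image = np.flip(image, axis=axis)
            mask = np.flip(mask, axis=axis)
    return image.copy(), mask.copy()

rng = np.random.default_rng(0)
image = rng.random((32, 256, 256), dtype=np.float32)  # Depth x Height x Width
mask = (image > 0.5).astype(np.float32)               # toy mask derived from the image

aug_image, aug_mask = paired_random_flip(image, mask, rng)
print(aug_image.shape, aug_mask.shape)
```

Because the same flips hit both arrays, the augmented mask still corresponds voxel for voxel to the augmented image.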
If you want to prove dependence, then yes, these kinds of hypothesis tests are useful. As I said yesterday, I've used this kind of test for this purpose: There was an observation about our data that seemed intuitively plausible, and that people believed was true based on experience, but that people weren't sure how to prove statistically. We could have gotten along without proving it; but it made everyone more comfortable to be working with facts instead of feelings.
Thanks a lot. One last, somewhat unrelated, question. Do you know why we divide by the geometric mean in kendall's tau-b formula ?
really ? how ?
Off the top of my head, I don't remember. It's probably in Pratt and Gibbons, Concepts of Nonparametric Theory, but I don't have time to look it up just now.
Ok i'll have a look. Thanks a lot!
im trying to plot 6 things onto a figure using matplotlib
#Creating a matplotlib figure so all results displayed on one figure
fig = plt.figure(figsize=(11,9))
fig.set_tight_layout(True)
print('Plotting Chart')
#Positive Collocation table
plt.subplot(2, 3, 1)
plot_table(positive_reviews_with_collocations_sorted[:40], f"Frequency of co-occuring words with POS tags of {bigram_postags} in Positive Reviews")
plt.subplot(2, 3, 2)
plot_table(positive_reviews_with_collocations_no_pos_sorted[:40], "Frequency of co-occuring words without POS tags in Positive Reviews")
#Negative Collocation table
plt.subplot(2, 3, 3)
plot_table(negative_reviews_with_collocations_sorted[:40], f"Frequency of co-occuring words with POS tags of {bigram_postags} in Negative Reviews")
plt.subplot(2, 3, 4)
plot_table(negative_reviews_with_collocations_no_pos_sorted[:40], "Frequency of co-occuring words without POS tags in Negative Reviews")
this has worked for me in the past but i havent used tables before, seems to not work now. is there another solution?
Does anyone know how to solve this issue?
You should use \ or / instead.
Thank you for this. I just copied from the properties description.
Instead of what you mean? I am using \
I dont know why it was rendered that way but i meant 2 of this \ or 1 of this /
A single \ is used to escape characters like when you do "\n" for a newline. You should use 2 of it in order to escape it as well.
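A short sketch of the options for writing a Windows path in Python. The folder and file names are hypothetical. A raw string (the `r` prefix) turns off escape processing, and `pathlib` can join the folder and the filename without typing a separator by hand.

```python
from pathlib import PureWindowsPath

# Hypothetical folder, written three equivalent ways.
escaped = "C:\\Users\\me\\Documents"   # each \\ is one backslash in the string
raw = r"C:\Users\me\Documents"         # raw string: backslashes taken literally
forward = "C:/Users/me/Documents"      # forward slashes are also accepted on Windows

folder = PureWindowsPath(raw)
file_path = folder / "data.csv"        # the folder alone is not a file; add the filename

print(escaped == raw)
print(file_path)
```

`PureWindowsPath` is used here only so the example behaves the same on any operating system; on a Windows machine `pathlib.Path` would be the usual choice, and the resulting path can be passed to `pd.read_csv`.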
When using two \ I got PermissionError: [Errno 13]
It means you dont have permission to access that file.
ehm. I can open the excel file. Maybe change the location of the file?
I think if the file is already open in excel then you can't open it from somewhere else.
what is the best way to work with huge-data sets? I have a data-set(12gb), I think its not the best way to download that locally and upload that to github.
At this moment, the excel file is not open but I can open it and have access to it.
When working with large datasets in Python, it's best to use libraries like Dask or Apache Spark, which allow you to process the data in a distributed manner across multiple machines
so like I can access to whole thing, in a server?
Are you sure that's the correct path. It doesnt have any extension at the end.
You're saying something there. I am copying from this, but as you mention there is no extension at the end
Yeah apparently that's just the directory location. You add the filename at the end and use read_csv instead
What should I put between the last part and the filename?
Yes
2 of this \ like before
data = pd.read_csv?
Yeah
Thank you.
Appreciate your patience
@untold cliff You wouldn't know why it doesnt come in that other format in the properties?
Why it doesnt include the filename ?
Yes and double \
guys how can I get rid of floating point imprecisions while calculating a polynomial's roots? I tried using the Decimal class from the decimal module but that didn't work
p = np.array([1, -5, 8, -4])
p_dec = [Decimal(str(coeff)) for coeff in p]
roots_dec = np.roots(p_dec)
print(roots_dec)
[2.00000006 1.99999994 1. ] <- it goes like this
No. I think the \ is Windows-specific though, don't really know why. As for not including the filename, there might be something in your settings to show the directory only
It's due to the implementation i guess but you can round the results to a suitable precision and it would be fine
alright thanks, I'll do that then
If the 1.99... should be 2 then it wouldn't work. Maybe you should set a tolerance like 1e-5 and check if the difference between the results and their rounded values is less than that tolerance. And then plug the roots back into the polynomial just to check
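A minimal sketch of that tolerance-and-plug-back idea, assuming the roots of interest are integers (the helper names are made up, and the approximate values are the ones printed above):

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest degree first) with Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def snap_roots(coeffs, roots, tol=1e-5):
    """Replace a root by the nearest integer when it lies within tol of it
    and that integer is an exact root of the polynomial."""
    snapped = []
    for r in roots:
        nearest = round(r)
        if abs(r - nearest) < tol and poly_eval(coeffs, nearest) == 0:
            snapped.append(nearest)
        else:
            snapped.append(r)
    return snapped

p = [1, -5, 8, -4]                      # x^3 - 5x^2 + 8x - 4
approx = [2.00000006, 1.99999994, 1.0]  # the values np.roots printed
print(snap_roots(p, approx))
```

Because the check is done in integer arithmetic, a snapped value is only accepted when it really is a root; anything that fails the check is left as the float it was.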
I just made the second part where I plug the values back in
finding roots of polynomials is a very nasty computational task. it's an ill-conditioned problem
what method does numpy use?
not newton methods, actually: per the docs, np.roots computes the eigenvalues of the companion matrix
you'd have to make your own implementation based on the decimal library where you take numpy's solution and take a few newton steps yourself to get better precision
that still won't be exact
sounds like a lot of work
not terribly, but also it's only worth it in some cases. what are the roots being used for?
factorization
you'll be better off using a CAS (computer algebra system) for that
why's that?
because you won't be able to find the exact roots using numerics except for very special cases
especially considering you can only represent rational numbers with a computer
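Polynomials with integer coefficients are one of those special cases: by the rational root theorem, any rational root p/q has p dividing the constant term and q dividing the leading coefficient, so the candidates can be tested exactly. A rough sketch with the standard library's `Fraction` (helper names are made up, and it does not report multiplicities or irrational roots):

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def exact_eval(coeffs, x):
    """Horner evaluation in exact rational arithmetic."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def rational_roots(coeffs):
    """Exact rational roots of an integer polynomial (highest degree first).
    Assumes nonzero leading and constant coefficients."""
    candidates = set()
    for p in divisors(coeffs[-1]):
        for q in divisors(coeffs[0]):
            candidates.update({Fraction(p, q), Fraction(-p, q)})
    return sorted(r for r in candidates if exact_eval(coeffs, r) == 0)

print(rational_roots([1, -5, 8, -4]))  # the polynomial from the question
```

Each root found this way can be divided out to continue the factorization, which is roughly the kind of bookkeeping a CAS takes care of.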
Does anyone here have experience with using DeepSpeed or colossal AI?
Am I able to use these to reduce the VRAM requirements of my models by offloading them to NVMe or RAM?
Are there any specific limitations on model architectures, e.g. transformers only, or can I use it for all model architectures?
How do I modify this code in order to remove the first column in the excel file, the one that includes the year?
try removing the year from your dataframe?
Explain please
remove the values for years in your data, or just unselect them before plotting
im assuming you don't want to plot the year in the graph?
hi
i am trying to understand the shapes of this graph
at the very bottom, i don't know where the 20 and the 650 comes from
hello guys could you help out a fellow
You have to ask a question
why is my validation accuracy low (50 percent) while training accuracy is over 95 percent?
you need to look at wherever you got the image from
I want to train a Neural Network using EfficientNetB2 and while the training phase is going well the validation is always at 50 percent. This is my code:
It's just the shape of the input to the LSTM
import numpy as np
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
from google.colab import drive
drive.mount('/content/gdrive')
!unzip /content/gdrive/MyDrive/chest-xray-pneumonia.zip
img = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/train')
img_val = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/val')
img_test = tf.keras.utils.image_dataset_from_directory(directory='/content/chest_xray/test')
just to confirm, by val accuracy, you mean validation and NOT test?
# Data augmentation
IMG_SIZE = 128
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE),
    tf.keras.layers.Rescaling(1./255),
    tf.keras.layers.RandomTranslation(0.2, 0.2),
    tf.keras.layers.RandomRotation(0.3),
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomZoom(0.2)
])
data_resize = tf.keras.Sequential([
    tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE),
    tf.keras.layers.Rescaling(1./255)
])
img_aug = img.map(
    lambda x, y: (data_augmentation(x, training=True), y))
img_val_aug = img_val.map(
    lambda x, y: (data_resize(x, training=False), y))
img_test_aug = img_test.map(
    lambda x, y: (data_resize(x, training=False), y))
# shuffle() returns a new dataset, so the result has to be assigned
img_aug = img_aug.shuffle(100)
# Model structure
base_model = tf.keras.applications.EfficientNetB2(input_shape=(128, 128, 3),
                                                  include_top=False,
                                                  weights='imagenet')
base_model.trainable = True
my_model = tf.keras.models.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
my_model.summary()
# Balancing labels
normal_count = 1341
pneumonia_count = 3875
total_count = normal_count + pneumonia_count
normal_weight = (1 / normal_count) * (total_count) / 2.0
pneumonia_weight = (1 / pneumonia_count) * (total_count) / 2.0
class_weight = {0: normal_weight,
                1: pneumonia_weight}
# Compile and train
base_learning_rate = 0.001
my_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
                 loss=tf.keras.losses.BinaryCrossentropy(),
                 metrics=['accuracy'])
# batch_size is not passed here: the Keras docs say not to specify it for
# tf.data datasets, which are already batched
history = my_model.fit(img_aug,
                       epochs=5,
                       class_weight=class_weight)
I tried different approaches to importing the dataset but the result is always the same. According to the article "Automated Diagnosis of Pneumonia from Classification of Chest X-Ray Images using EfficientNet" I should get more than 90 percent accuracy on the validation and test sets.
i didnt get any comments on stack
im assuming your code isn't from a linked github from the paper?
Can you explain what the 3 numbers mean
@visual violet Looking at the image, I'm guessing the 650 is the size of the hidden layer, the 35 is the number of neurons, and the 20 is the batch size..?
i could be wrong though, been a long time since i last did anything with LSTMs
i have some data that i've cleaned (just removed and replaced null values) but i wanna ask how do u know which columns need normalizing, or is it good practice to always normalise?
How would I unselect them before plotting? I may want to use the years later and it would be kind of a pity to remove them now
That sounds good. Thank you man
Also can you explain what hidden size means ?
Like the number of outputs ?
@wooden sail @subtle mural How do I unselect them before plotting this?
maybe just if country == "Year": continue
Please don't ping frequent answerers unless they've already started answering your specific question.
Okay. I know Edd is a genius, so I figured he might know. Sorry for that.
What is that webpage where I can write the code and share it here?
correspondingNegativeReviewsQuantity.append(len(reviews_with_sentiment.loc[(reviews_with_sentiment['sentiment'] == 'Negative') & (reviews_with_sentiment['id'] == i)]))
how do i make things more readable cos this is getting silly XD
You can break the code across multiple lines, using line continuation with the backslash \ symbol or simply by splitting inside the brackets, as here.
example:
correspondingNegativeReviewsQuantity.append(
    len(
        reviews_with_sentiment.loc[
            (reviews_with_sentiment['sentiment'] == 'Negative')
            & (reviews_with_sentiment['id'] == i)
        ]
    )
)
Or split it up into multiple lines
cond1 = reviews_with_sentiment['sentiment'] == 'Negative'
cond2 = reviews_with_sentiment['id'] == i
correspondingNegativeReviewsQuantity.append(len(reviews_with_sentiment.loc[cond1 & cond2]))
hi! i'm trying to apply the function mapPartitions to a file with 111k rows in Google Colab but it got stuck there. Does anyone know how could i fix it? thanks in advance.
Try https://paste.pythondiscord.com/isonozepiy
Optimize your code by using efficient algorithms and data structures. If your code still runs slowly, consider breaking up the tasks into smaller, more manageable chunks.
how many partitions should i set for a file with 111k rows?
If the data is not evenly distributed across the partitions, some partitions may take longer to process than others, leading to performance issues
It depends on several factors such as the size of your cluster, the available resources, and the nature of your data and processing tasks. A good starting point is to set the number of partitions to be a multiple of the number of cores in your cluster. For example, if you have a 4-core cluster, you can set the number of partitions to 4, 8, 12, or any other multiple of 4.
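As a rough sketch of that rule of thumb (the function name and the default multiplier of 2 are made up; the right multiple depends on the workload):

```python
import os

def suggest_partitions(n_rows, cores=None, multiplier=2):
    """Return (number of partitions, rows in the largest partition),
    using a multiple of the core count as the partition count."""
    cores = cores or os.cpu_count() or 1
    n_partitions = cores * multiplier
    rows_per_partition = -(-n_rows // n_partitions)  # ceiling division
    return n_partitions, rows_per_partition

# 111k rows on a 2-core machine
print(suggest_partitions(111_000, cores=2))
```

With only 111k rows the per-partition size is small either way, so if the job still hangs, the partition count is probably not the bottleneck.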
okay tx! i think google colab has only 2 cores
i'll try w 2
So I want to make a layer of abstraction using a NLP for my database. Are there any good resources? This is what I currently have
(Posted also in #databases )
NLP is a concept. Programs that apply NLP are not called "an NLP".
You'll want to look into intent classification. You need a model that can take a user's command as text, and identify which in a specific set of interaction types the user wants to do. And which words are important for that specific interaction. So if you ask "what is John's phone number", the model needs to classify this as REQUEST_PHONE_NO and JOHN as the entity of interest.
Right, so currently I am using the spacy library and their NLP command does that (kinda) I wasn’t sure if there was a better resource to investigate on top of that
What is "their NLP command"?
import spacy
# Load English tokenizer, tagger, parser and NER
nlp = spacy.load("en_core_web_sm")
doc = nlp("Mr. Best flew to New York on Saturday morning.")
ents = list(doc.ents)
assert ents[0].label_ == "PERSON"
assert ents[0].text == "Mr. Best"
That's not "an NLP command". nlp is just the variable for the English model instance.
ah understand, sorry
currently I have my code setup to use the
doc.noun_chunks
to best match what item the person is looking for in the database
Anyway, you can use spaCy's entity recognition capabilities to identify relevant parts of the user input. But you still need to figure out what kind of information the user is asking for.
are there any libraries which help with that, or does that need to be coded on my own?
Learning the basics of spacy is probably ambitious enough for this project, so I would probably just use conditional logic
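To give an idea of what "conditional logic" could look like, here is a minimal rule-based sketch using only the standard library. The intent names, the trigger patterns, and the possessive rule for the entity are all invented for illustration; in the real project the entity would more likely come from spaCy's `doc.ents`.

```python
import re

# Hypothetical intents and trigger phrases; a real project would define its own.
INTENT_PATTERNS = {
    "REQUEST_PHONE_NO": re.compile(r"\bphone( number)?\b", re.IGNORECASE),
    "REQUEST_EMAIL": re.compile(r"\be-?mail( address)?\b", re.IGNORECASE),
}
# Very rough entity rule: a capitalized word followed by 's ("John's").
POSSESSIVE = re.compile(r"\b([A-Z][a-z]+)'s\b")

def parse_request(text):
    """Return (intent, entity); either may be None if no rule matches."""
    intent = next(
        (name for name, pattern in INTENT_PATTERNS.items() if pattern.search(text)),
        None,
    )
    match = POSSESSIVE.search(text)
    entity = match.group(1).upper() if match else None
    return intent, entity

print(parse_request("what is John's phone number"))
```

Rules like these break quickly on phrasing they were not written for, which is the point at which a trained intent classifier starts to pay off.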
Because you wouldn't be able to train an intent classifier without learning the basics of model training. Which is an entirely separate concern from what library you use
okay! Thank you for the help! I appreciate it
You can put it inside parentheses instead of using \ cause anything in parentheses counts as a single expression i believe
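A small sketch of the two styles side by side, on made-up stand-in data (plain dicts instead of the pandas DataFrame from the question):

```python
# Made-up stand-in data; the original uses a pandas DataFrame.
reviews = [
    {"id": 1, "sentiment": "Negative"},
    {"id": 1, "sentiment": "Positive"},
    {"id": 2, "sentiment": "Negative"},
    {"id": 1, "sentiment": "Negative"},
]
i = 1
first = reviews[0]

# Explicit continuation: the backslash must be the last character on the line.
is_match_backslash = first["sentiment"] == "Negative" \
    and first["id"] == i

# Implicit continuation: inside parentheses, line breaks are free.
is_match_parens = (
    first["sentiment"] == "Negative"
    and first["id"] == i
)

negative_count = sum(
    1
    for review in reviews
    if review["sentiment"] == "Negative" and review["id"] == i
)
print(is_match_backslash, is_match_parens, negative_count)
```

PEP 8 prefers the parenthesized form, partly because a stray space after a backslash is a syntax error that is hard to spot.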
are there automated scripts for finding white space areas in pictures?
hey,
i am self learning , i wanna ask a question about feature column
feature_columns = []
latitude = tf.feature_column.numeric_column("latitude")
feature_columns.append(latitude)
longitude = tf.feature_column.numeric_column("longitude")
feature_columns.append(longitude)
fp_feature_layer = layers.DenseFeatures(feature_columns)  # ?????????????????????
def create_model(my_learning_rate, feature_layer):
    """Create and compile a simple linear regression model."""
    # Most simple tf.keras models are sequential.
    model = tf.keras.models.Sequential()
    model.add(feature_layer)  # ???????????????????????????????????????? (2)
    model.add(tf.keras.layers.Dense(units=1, input_shape=(1,)))  # ???????????????????????????????? (3)
    model.compile(optimizer=tf.keras.optimizers.experimental.RMSprop(learning_rate=my_learning_rate),
                  loss="mean_squared_error",
                  metrics=[tf.keras.metrics.RootMeanSquaredError()])
    return model
is there a reason for adding fp_feature_layer to the model on its own?
I thought all dense layers are added at (3), so why did we add feature_layer separately???
NVM , chat gpt got it LOL
mind blowing
Finding polynomial roots is a long-standing problem that has been the object of much research throughout history. A testament to this is that up until the 19th century, algebra meant essentially theory of polynomial equations.
In mathematics, a square-free polynomial is a polynomial defined over a field (or more generally, an integral domain) that does not have as a divisor any square of a non-constant polynomial. A univariate polynomial is square free if and only if it has no multiple root in an algebraically closed field containing its coefficients. This motivates t...
hey thanks for the reply, I switched to sympy in the end
Sympy is a good choice.
It does these kinds of algorithms.
now I'm kind of figuring out how to factorize properly while dealing with polys being strings
I had a chance to look. It's not in Pratt and Gibbons, but actually there's an explanation on Wikipedia under https://en.wikipedia.org/wiki/Rank_correlation#General_correlation_coefficient. The point is that if you look at it the right way, correlation coefficients are the cosine of an angle. The denominator is actually the usual normalization by the length of a vector (in this case, the lengths are really the Frobenius norm of certain matrices).
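A minimal sketch of that "cosine of an angle" view, following the general correlation coefficient as described on the linked Wikipedia page (the sample data and function names are made up). Each variable is turned into a matrix of pairwise scores, and the coefficient is the inner product of the two matrices divided by their Frobenius norms:

```python
from math import sqrt

def sign(v):
    return (v > 0) - (v < 0)

def general_corr(x, y, score):
    """General correlation coefficient: the cosine of the angle between the
    matrices a_ij = score(x_i, x_j) and b_ij = score(y_i, y_j)."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    a = [score(x[i], x[j]) for i, j in pairs]
    b = [score(y[i], y[j]) for i, j in pairs]
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = sqrt(sum(p * p for p in a))  # Frobenius norm of a
    norm_b = sqrt(sum(q * q for q in b))  # Frobenius norm of b
    return dot / (norm_a * norm_b)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

kendall_tau = general_corr(x, y, lambda u, v: sign(v - u))  # a_ij = sign(x_j - x_i)
pearson_r = general_corr(x, y, lambda u, v: v - u)          # a_ij = x_j - x_i
print(kendall_tau, pearson_r)
```

For this data the two choices of score give 0.6 and 0.8 up to floating-point rounding. Since `x` and `y` here are already ranks, the second value is also Spearman's rho.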
Thanks for the help!
oh god, I remember a huge debate about using kendall vs spearman correlation on a project I was on
worse, the people kept bringing it up again and again
Good afternoon
Maybe this if you don't care about order:
from itertools import count
x = [0, 0, 0, 2, 50, 50, 80, 99, 998, 998, 998]
mapping = {element: code for element, code in zip(set(x), count())}
unique = [mapping[i] for i in x]
print(unique)
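If the codes should follow the order of first appearance instead, a sketch along the same lines: `set()` makes no promise about ordering, while `dict.fromkeys` keeps insertion order (Python 3.7+).

```python
x = [0, 0, 0, 2, 50, 50, 80, 99, 998, 998, 998]

# dict.fromkeys keeps the first occurrence of each value, in order.
mapping = {value: code for code, value in enumerate(dict.fromkeys(x))}
codes = [mapping[value] for value in x]
print(codes)
```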
Can someone help me?
I can't read a huge csv. Not even filtering only 2 columns
I dont know about pytorch. How about numpy ?
np.unique has a return_index argument, if set to true, it would return the first index of each unique element which would be fine i guess
What is the problem exactly ?
anyone know how to sort by length and then between the elements with same length sort by x[1]?
Slight rant so sorry and yes i want answers but why do people try to modify a deep learning model then ask the most rudimentary question possible i.e. "how to increase context window from 2048 to large number"
@charred egret sorry, return_inverse is more appropriate i believe
Good luck 😀
(len(x), x[1]) ?
already tried that
This is weird
^ Why are there tuples of strings sometimes and sometimes ints?
because it's frequent items
9 is a frequent item but 7,18 too
Is this what you've tried so far ?
How come you didnt get an error
When you do len(x[0]) you're sometimes doing len on ints, which should give a TypeError
uh, so i dont get any error
Yeah, I cannot reproduce this.
But I think first, you shouldn't be mixing tuples of str[int]'s and ints. That seems like a pattern that will not make things easy to work with in your data.
Is there a chance this is a custom output and not just print(sequence)
so if i do sort only by len it works but then the second column is not sorted
uh no
it's not working well either
If your elements are all strings then it makes sense
You're not getting an error for len(x[0]) because they are strings. From 9 to 0 they all sort first because they're just 1 character, and then you get 2-character strings like 11 and 10, and also tuples which are of length 2
what can i do to fix it
Can you give me an example of what your desired output would be first
first singleton, then double, then triplets, and each group sorted by x[1] ASC
What library is this in?
pyspark
@red rose try lambda x: (str(x).count(','), x[1])
How would you do it in pandas
yh now it's working!!! thank you so much ❤️
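For reference, a plain-Python sketch of the same two-level sort. The rows are hypothetical (itemset, support) pairs, with single items wrapped in 1-tuples so that every row has the same shape and `len` can be used directly instead of counting commas:

```python
# Hypothetical (itemset, support) rows; single items are wrapped in 1-tuples
# so every row has the same shape.
rows = [((7, 18), 4), ((9,), 10), ((7, 9, 18), 2), ((7,), 6), ((9, 18), 3)]

# Tuples compare element by element: itemset size first, then the second column.
ordered = sorted(rows, key=lambda row: (len(row[0]), row[1]))
print(ordered)
```

The same key function should carry over to `RDD.sortBy` in PySpark, as long as the rows have this uniform shape.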
Assuming it's a pd Series and each element is a list or a plain number: s.sort_values(key=lambda col: col.map(lambda x: len(x) if x is list else 0))
Would be the first thing to come to mind
Thanks! Btw i think it should be isinstance(x, list) (or even better isinstance(x, collections.abc.Iterable) to include tuples and other iterables)
Works too
This pic demonstrates how the order (degree) of polynomial features affects the MSE of a function. So it gets better, sweet spot, then worse.
So my question is, is it usually like that (better->sweet spot->worse) when tuning hyperparameters? Cus if yes, then wouldn't it be cool if instead of an exhaustive GridSearchCV, or a random RandomizedSearchCV, you could do something like a RandomizedSearchCV but with logic to get the best result? For example, 3 random points, create a curve, bottom point, adjust the curve, bottom point, etc., not just choosing parameters randomly? Or would that not work, because not all hyperparameters follow the better->sweet spot->worse rule, and some just produce entirely random, unpredictable results, meaning some random combination might just work better, meaning the only way to be sure u got the best params is a grid search?
Hyperparameters are tuned on the validation data, or on the training data using cross-validation, not on the test data. If you do it the way you describe, I think you would be overfitting to the test data.
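If the validation error really is "better, sweet spot, worse" in a single integer hyperparameter, that shape can be exploited directly. A rough sketch using ternary search, which is a swapped-in stand-in for the curve-fitting idea and not something scikit-learn provides; the error function below is a made-up toy, and in practice `score` would be a cross-validated error, never the test error:

```python
def ternary_search_min(score, lo, hi):
    """Find the integer in [lo, hi] minimizing score, assuming score is
    strictly unimodal (decreasing, then increasing) on that range."""
    cache = {}

    def cached(k):
        if k not in cache:
            cache[k] = score(k)
        return cache[k]

    while hi - lo > 2:
        m1 = lo + (hi - lo) // 3
        m2 = hi - (hi - lo) // 3
        if cached(m1) < cached(m2):
            hi = m2 - 1   # the minimum cannot be at m2 or beyond
        else:
            lo = m1 + 1   # the minimum cannot be at m1 or before
    best = min(range(lo, hi + 1), key=cached)
    return best, len(cache)

# Made-up validation error with its sweet spot at degree 6.
def toy_validation_error(degree):
    return (degree - 6) ** 2 + 0.5

best_degree, evaluations = ternary_search_min(toy_validation_error, 1, 20)
print(best_degree, evaluations)
```

This finds the sweet spot with far fewer evaluations than trying all 20 degrees, but only because the toy curve is perfectly unimodal. Real cross-validated scores are noisy and hyperparameters interact, which is why model-based searches such as Bayesian optimization are the usual way to make this idea robust.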
Good to know. Can you share the solution with us?
