#data-science-and-ml
well, here's your chance to permanently stop
probably should
"array" means something in math, but then some programming languages use it to refer to data structures in that language. In Python data science, "array" refers to a data structure intended to represent mathematical arrays.
in a lot of languages, what they call an array is the go-to data structure for having things in a certain order, whatever they are.
and in python, you use a list for that
but they are not the same.
I have been deceived
you can use numpy to turn a list into an array tho, correct?
yes, in the sense that you can pass a list to the np.array constructor
right
So to mention normal lists/arrays, we call them lists and we use array for the math definition?
in Python, you have lists for general-purpose containment of things. But if you're doing data science, you'll import numpy and do all the math with arrays.
they both use square brackets in their notation/representation and contain things in specific orders. that's pretty much where the similarities end.
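To make the difference concrete, here is a minimal sketch, assuming NumPy is installed. The variable names are made up for illustration. The same operators do different things depending on whether you have a list or an array:

```python
import numpy as np

# A plain Python list: a general-purpose ordered container.
numbers_list = [1, 2, 3]
# Passing the list to the np.array constructor gives a NumPy array.
numbers_array = np.array(numbers_list)

# "*" repeats a list, but multiplies an array element by element.
doubled_list = numbers_list * 2
doubled_array = numbers_array * 2

# "+" concatenates lists, but adds arrays element by element.
summed_list = numbers_list + numbers_list
summed_array = numbers_array + numbers_array

print(doubled_list)   # [1, 2, 3, 1, 2, 3]
print(doubled_array)  # [2 4 6]
```

The list treats its contents as arbitrary things in order; the array treats them as numbers to do math on.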
granted, lists and arrays are more closely related than like, potatoes and frogs
but thinking of them as similar when you're dealing with programming/data science will probably just make things worse. and if people are constantly having to verify if you used the right term when you say list or array, you'll waste all your time.
I see, now time to head to google for a sec to ask what is an array in math
n u m b e r s
here are some two-dimensional arrays, written in math notation
two-dimensional arrays are also called matrices.
ah yes numpy
if you were using numpy, the first array would be represented like this
[[1, 2, 3]
[4, 5, 6]]
with the square brackets reminding you of how arrays are represented in math notation.
scalars are sometimes also treated as some sort of 0D arrays?
(e.g., you can do np.float64(123)[True])
but the whole array is "one thing". it's not a "nested array". the horizontal [1, 2, 3] isn't treated differently than the vertical [1, 4]
is this how matmul works?
just asking because seen it before
yes, I just picked this diagram arbitrarily because it has arrays, but it's demonstrating the matmul formula.
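Since the diagram itself is not reproduced here, a rough sketch of the matmul formula may help. It assumes NumPy; the second array `b` and the helper `matmul_by_formula` are made up for this example. Each output entry is the sum of products of one row of the left array with one column of the right array:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3), the array written out earlier
b = np.array([[7, 8],
              [9, 10],
              [11, 12]])       # shape (3, 2), invented for this example

def matmul_by_formula(left, right):
    """result[i, j] = sum over k of left[i, k] * right[k, j]."""
    rows, inner = left.shape
    inner_right, cols = right.shape
    assert inner == inner_right, "inner dimensions must match"
    result = np.zeros((rows, cols), dtype=left.dtype)
    for i in range(rows):
        for j in range(cols):
            result[i, j] = sum(left[i, k] * right[k, j] for k in range(inner))
    return result

manual = matmul_by_formula(a, b)
builtin = a @ b   # NumPy's matrix multiplication operator

print(manual)
```

In practice you would only ever write `a @ b`; the loop version is just there to show what that operator computes.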
oh that's neat, I was actually a bit confused on how that worked
mind if i save it?
don't ask me, I just plucked it from Google
xd
but you can save anything I say in this server, unless I'm confessing to a crime.
Hmmm
welp those tips are actually useful, I probably need to write them up so I don't forget it just in case
Thanks for the tips mate! Wish me luck because I am honestly only decent at math
I've completed the basic python knowledge and want to learn data analysis, can anyone suggest me some useful courses or resources?
!resources
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
filter these by data science.
also, I see you asked the same question in #pedagogy. that is not the topic of that channel.
@serene scaffold So, anaconda is basically used as a package manager for python where we have different tools like spyder, jupyter etc? And, where do you do the coding part if not on jupyter ntbk ?
IDEs like spyder or pycharm are not the execution environment for the program. They just edit the code as a text file and give you tools for working with them. Whereas jupyter notebooks are both the editor and the executor.
And, where do you do the coding part if not on jupyter ntbk ?
Not to pick on you, but the fact that this is a question just goes to show how general programming knowledge isn't taught in data science resources.
so, where?
wait do you mean vs code?
outside of jupyter notebooks, you can edit python programs with any editor and pass the file to the python executable.
I'm confused, I thought we could execute/run files in spyder too.
vs code is another text editor with features to help you program. but it's not what runs the code.
it connects to python.
spyder might have a button to run a program (I've never used spyder), but that button sends the program to the Python interpreter, which exists separately from Spyder.
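A small sketch of that separation, using only the standard library. The file name `hello.py` and its contents are made up, and this is only an approximation of what an editor's run button might do, not a description of Spyder's actual internals:

```python
import os
import subprocess
import sys
import tempfile

# A Python program is just text. Any editor could have produced this string.
source = 'print("hello from a plain text file")\n'

with tempfile.TemporaryDirectory() as folder:
    script_path = os.path.join(folder, "hello.py")
    with open(script_path, "w", encoding="utf-8") as handle:
        handle.write(source)

    # sys.executable is the interpreter running this example. The file is
    # handed to it as a separate process, independent of whatever wrote the file.
    completed = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,
        text=True,
        check=True,
    )

output = completed.stdout.strip()
print(output)
```

Running `python hello.py` in a terminal does the same thing by hand.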
ohkay and that python interpreter exists in jupyter?
pretty much. (jupyter itself is actually a python program, but let's not get into that.)
Idle ?
no one uses idle except for learning
keep in mind: programs are just text
you can use any text editor.
jupyter is doing more than just editing the text.
jupyterlab might have more features
its executing too?
yes, and visualizing
how can i restrict my epoch in fitting in keras?
i mean how can i let it print every 10th epoch?
suppose you develop a model that a business wants to use. they can't just take your notebook and import it into their system
notebooks are a "dead end", in that sense.
just keep the history and parse it after ?
https://stackoverflow.com/questions/44931689/how-to-disable-printing-reports-after-each-epoch-in-keras
this answer shows something like
verbose=1 if epoch % 10 == 0 else 0
but I am lost about like what in the world is epoch referred to here? as in where do we define it and how do we execute it.
so I'm not supposed to use jupyter notebook since I can't call import on separate files and I can't deploy models which I train using code written on jupyter. Am I right?
when you write code in a notebook, you're writing it so that you can see the result after each cell. you're not writing it so that it can be used outside of the notebook.
oh okay, so jupyter would make sense for learning / teaching
jupyter is fine for quick experimentation and visualization
by reading those comments it says you just define "verbose" that way
Gotcha and people across different systems wouldn't be able to access my notebooks, so I should use something like vs code
i've never tried it. im curious, so im going to
jesus
verbose takes more than binary ! cool!
Hello everyone
how does root mean squared error punish outliers? is it always better than mean absolute error?
Can anybody tell me what do u mean by 'regularize the data column wise'
ya i'd just go verbose off , record the output and parse it
better yet, graph it
i'll need to see how to do that afterwards. but thanks for help:D
theres a bunch of ways... sklearn i might have a link to an example somewhere
?
if youre developing something, its nice to have html docs along any presentation. jupyter can make reproducibility requirements easy for non programmers to understand
when you take the square of something, it gets bigger. so if you take the square of all the errors, then the bigger errors get "more big" than the smaller errors
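A quick worked example of that effect, in plain Python with made-up error values. One large error moves RMSE much more than it moves MAE:

```python
import math

def mae(errors):
    """Mean absolute error: every error counts in proportion to its size."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error: squaring makes big errors count extra."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

clean = [1, -1, 1, -1, 1]          # five small errors
with_outlier = [1, -1, 1, -1, 10]  # same, but the last one is an outlier

print(mae(clean), rmse(clean))                # both are 1.0
print(mae(with_outlier), rmse(with_outlier))  # 2.8 versus about 4.56
```

Here the outlier raises MAE from 1.0 to 2.8, while RMSE goes from 1.0 to roughly 4.56, because the squared outlier (100) dominates the sum.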
"better" in what sense? for what purpose? in the presence of outliers we say that methods based on squared errors are not robust, in that a small number of extreme outliers can significantly change the results
"column-wise" means "individually for each column". so regularize each column of your data
How do the html docs help? Do they contain description about the code?
If the bigger error gets more big then it will increase the overall RMSE more how is that punishing?
import matplotlib.pyplot as plt
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential # initialize neural network library
from keras.layers import Dense # build our layers library
def build_classifier():
    classifier = Sequential() # initialize neural network
    classifier.add(Dense(units = 8, kernel_initializer = 'uniform', activation = 'relu', input_dim = x_train.shape[1]))
    classifier.add(Dense(units = 8, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, epochs = 70,batch_size=10)
history = classifier.fit(x_test, y_test, validation_split=0.20, epochs=70, batch_size=100, verbose=False)
# Plot training & validation accuracy values
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
# Plot training & validation loss values
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
This might help. data is the classic mushroom set
yeah i was just messing with history.History.
Better in sense that lesser error is what we are after right we need to minimize the error
But could you tell me how, approach
k good cause that def has typos and problems ...
yes, thats one of the big selling points about jupyter. you use markdown with it. so you can walk them through with pictures and links and happiness
If i go for lasso then it removes those which it doesn't need and shrinks to 0
html is just a static form to simplify distribution
@serene scaffold one more thing. I knew that anaconda is used as something as a package manager and that creates a specific conda system where all the tools reside. So instead, I should not use anaconda but learn venv for making virtual environments and then install tools as per my need. Is my assumption correct?
because models usually look for minimum error. so bigger error is "bad" from the perspective of the model fitting process, so you might end up with a model that "focuses" on trying to avoid those large errors, at the expense of focusing on other things.
How to go column wise I am not getting it
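One possible reading, sketched with NumPy: this assumes "regularize" here means rescaling (standardizing) each column, which may not be what the original instruction meant. The data values are made up. The key idea is `axis=0`, which computes one statistic per column:

```python
import numpy as np

# Three rows (samples), two columns (features) on very different scales.
data = np.array([[1.0, 100.0],
                 [2.0, 200.0],
                 [3.0, 300.0]])

# axis=0 runs down the rows, giving one mean and one std per column.
column_means = data.mean(axis=0)
column_stds = data.std(axis=0)

# Broadcasting applies each column's own mean and std to that column only.
scaled = (data - column_means) / column_stds

print(scaled)
```

After this, each column has mean 0 and standard deviation 1, regardless of its original scale.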
Oof, I totally forgot that the text between code cells in the notebooks is html, forgive me lmao
you can export the whole document as html
cool
does anyone know any popular deconvolution algorithms in python for microscopy images (especially with confocal microscopy images)?
skimage ?
I didn't really encounter the .evaluate() method until I started learning Deep Learning. So I also remember asking myself this same question
So yeah, it can be likened to the .score() method in Sklearn
hm history came as handy
history.history['accuracy'][::10]
is good enough
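For anyone unsure what that slice does, here is a tiny sketch. `fake_history` is a hypothetical stand-in for `history.history` after a 30-epoch run, with invented accuracy values:

```python
# Hypothetical stand-in for history.history; the numbers are invented.
fake_history = {"accuracy": [round(0.50 + 0.01 * epoch, 2) for epoch in range(30)]}

# [::10] keeps every 10th entry, starting from index 0.
every_tenth = fake_history["accuracy"][::10]

for epoch, value in zip(range(0, 30, 10), every_tenth):
    print(f"epoch index {epoch}: accuracy {value}")
```

Note that the indices are zero-based, so this picks epoch indices 0, 10 and 20, that is, the 1st, 11th and 21st epochs.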
thanks , ill use that
Gradio can be used to demo and deploy a machine learning model right inside a JNB. I've tried it before and surprisingly it worked. Although I always use VSCode when it comes to deploying a model.
yes
this makes me very sad.
i don't know if i agree with the recommendation to avoid anaconda entirely. i think if you want to use it, you need to learn how it works; much like venv. however i also don't know what the specific context for your question was
I have 100+ features in my data and I need to find data points that are closer. I tried Euclidean, but read that it doesnt work well in higher dimensions. Are other methods like distance correlation/ clustering better?
closer than what? to what?
are you looking for k-nearest neighbors?
unfortunately distance-based techniques generally don't work well in high dimensions. you might want to try some dimension reduction, maybe something "classical" like PCA, or a lower-dimensional vector embedding (e.g. with an autoencoder)
also: what kinds of features are these?
This is a patient health EHR dataset. I have a test patient and I need to find the list of patients from the training data who are closer to this test patient and then impute missing values
try cosine distance
Sure, will try that. I thought cosine also has curse of dimensionality though
is supposed to work better than euclidean on high dimensions
ok, thank you @novel elbow
i'd be skeptical of cosine distance on strongly-heterogeneous data, especially since some of these features might be categorical
that said, i'd be skeptical of all pairwise distances in that context, without first doing some dimension reduction
you'd at least need to be very careful to scale the features properly
also not sure if it matters, but cosine "distance" (1 - cosine similarity) is not a valid distance metric
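A small counterexample, in plain Python, of why it fails as a metric: cosine "distance" can violate the triangle inequality. The three vectors are chosen for illustration:

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

a = (1.0, 0.0)
b = (1.0, 1.0)   # halfway between a and c in angle
c = (0.0, 1.0)

direct = cosine_distance(a, c)                          # 1.0
detour = cosine_distance(a, b) + cosine_distance(b, c)  # about 0.586

# A true metric requires direct <= detour; here the detour is shorter.
print(direct, detour)
```

Whether this matters depends on the algorithm; some nearest-neighbor index structures rely on the triangle inequality, others do not.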
@desert oar , so we need to do PCA first and then try distance metrics?
not necessarily PCA, but it might be a good idea
you can try it both ways of course!
actually that would be a good exercise imo
sure, thanks a lot!
This is what it looked like on JNB
what language is Ndewo. Igbo?
Yeah
I'm so fucking cultured.
Hahaha do you have Igbo coworkers? Or it's just from movies?
Neither, I'm a c o m p u t a t i o n a l l i n g u i s t
Awesome. I'm impressed
some of the implementation details are the same, and they use some of the same concepts, but they're different.
sets are pretty underrated.
though this isn't a data science question. if you open a help channel (see #how-to-get-help) I'll go over it with you briefly.
Oh wait damn
I thought that I am at general
Lol
@terse swallow go to #help-potato before someone else takes it
shouldn't a bigger dataset give better results?
1500 gives me a lower loss function than 4000
assuming that all datasets are of the same quality, more data is either good or unnecessary
ok that answers my problem thanks.
conda-forge is returning server error code 403 when i try to install packages ... anybody else having this issue?
when an RNN backpropagates through time, the weights get trained to more accurately take in a hidden state and an input and produce the correct output vector. This I understand. What I don't understand is why the weights for how the hidden state gets calculated are changed
there are two neural nets in this RNN:
```py
self.in2hidden = nn.Linear(input_size + hidden_size, hidden_size)
self.in2output = nn.Linear(input_size + hidden_size, output_size)```
the first one gets better at producing the correct output when it backpropagates
are you taking increasingly large samples of the same dataset? or are these different datasets totally different from each other? are they simulated datasets, or real data?
but how is the loss even calculated for the second one? what is loss for a hidden state?
if these are random samples, i am skeptical that the differences between the 1500, 2000, 2500, and 4000 runs are meaningful. it might be random sampling variation. i.e. the "variance" in the "bias-variance" tradeoff
normally loss is the mean squared error between the guess and the correct answer
but there is no "correct hidden state"
so how would you even calculate it
presumably it's propagating backwards from the actual labels on the data
that's how all neural networks work
conceptually, at least
information propagates backwards from the loss function
if I have a dataset with 3 words, and the output looks like this
```py
[1,0,0]```
but the correct word was
```py
[0,0,1]```
then the mean squared error between those is the loss
but with the hidden state neural net
its output is a hidden state
let's say it has 6 neurons
then the hidden state is like this
```py
[0.5,0.5,0.5,0.5,0.5,0.5]```
but when it back propagates, what does it compare that hidden state to?
is the hidden state an output?
the output is the predicted word
maybe in your case the hidden state maps 1:1 with the output
if we consider an example with this dataset
```py
["alice", "saw", "bob"]```
and 6 hidden layer neurons, then the first thing that happens is the alice vector gets appended to the hidden state
```py
[1,0,0,0.5,0.5,0.5,0.5,0.5,0.5]```
backprop starts at the loss L which is a function of y and the model parameters
we pass that vector of length 9 to the network self.in2output
there's a nice explanation of "backpropagation through time" in RNNs here btw https://mmuratarat.github.io/2019-02-07/bptt-of-rnn
and it predicts "alice" is the next word, which is wrong
it should say "saw" is the next word
then we calculate the error and backpropagate
right
but we also have the next hidden state
we pass that 9 vector to self.in2hidden
and it outputs a 6 vector, the new hidden state
but I checked and the weights of self.in2hidden are updating after each backpropagation
yep, that propagates from the loss function as well
so if self.in2output gets better at predicting the correct output
what does self.in2hidden get better at doing?
why change the weights?
and how does it change them, if the loss is based on the mse of the output
can you show the full model you wrote?
yes
ok so I just skimmed the link you sent, and this makes it clear that the total loss at the end of the sentence is in fact used to update the weights of self.in2hidden
ahhh yeah
but now my question is why
that was the confusion
n,n+1 vs n-1,n
the loss is computed on the next step
using the next hidden state
i've got this
```py
for sentence in dataset:
    hidden_state = model.init_hidden()
    input_tensor = get_one_hot_sentence_tensor(sentence)
    loss = 0
    index = 0
    for word in input_tensor:
        if index+1 >= len(input_tensor):
            break
        output, hidden_state = model(word, hidden_state)
        current_loss = criterion(output, input_tensor[index+1])
        loss += current_loss
        index += 1
    optimizer.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), 1)
    optimizer.step()```
which adds the loss at each word
and then backpropagates at the end of the sentence
but my confusion is this:
self.in2output depends on the hidden state to make predictions
if you change the weights of self.in2hidden, doesn't that screw up the changes you just made to self.in2output?
let me look over this. but fwiw pytorch does have a built-in rnn module https://pytorch.org/docs/stable/generated/torch.nn.RNN.html
are you following this guide? https://jaketae.github.io/study/pytorch-rnn/ your code uses the same names
In this post, we'll take a look at RNNs, or recurrent neural networks, and attempt to implement parts of it in scratch through PyTorch. Yes, it's not entirely from scratch in the sense that we're still relying on PyTorch autograd to compute gradients and implement backprop, but I still think there are valuable insights we can glean from this imp...
yeah I wanted to build a vanilla RNN to help me better understand the difference between RNNs, LSTMs and transformers, so I followed that guide and added some of my own functions and datasets. so far it's been really successful
Hi everyone! Could you explain to me what a Torch.gradient is?
the only part I still don't fully understand is the BPTT
are you confused about what a gradient is or what pytorch's implementation of it is?
this code uses the previous hidden state to compute the current hidden state. so this model computes loss at time t and information flows backward to hidden state at time t-1
the blog post computes loss at time t+1 and information flows backward to hidden state at time t
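To see why the hidden-state weights get a gradient at all, here is a toy sketch in plain Python. It is a scalar RNN with invented weights and data, not the model from this conversation, and it uses a finite difference instead of autograd. The loss is only ever measured on the outputs, yet nudging the hidden-to-hidden weight still changes it, because later outputs are computed from earlier hidden states:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sequence_loss(hidden_weight, inputs, targets):
    """Total squared error of a scalar RNN over one sequence."""
    input_to_hidden = 0.8    # hypothetical fixed weights
    input_to_output = 0.5
    hidden_to_output = 1.5
    hidden = 0.0             # initial hidden state
    loss = 0.0
    for x, target in zip(inputs, targets):
        # Both results are computed from the PREVIOUS hidden state.
        output = input_to_output * x + hidden_to_output * hidden
        new_hidden = sigmoid(input_to_hidden * x + hidden_weight * hidden)
        loss += (output - target) ** 2
        hidden = new_hidden
    return loss

inputs = [1.0, 0.0, 1.0]
targets = [0.0, 1.0, 0.0]

# Central finite difference: how much does the loss move if we nudge
# the hidden-to-hidden weight a little in each direction?
eps = 1e-6
base = 0.3
slope = (sequence_loss(base + eps, inputs, targets)
         - sequence_loss(base - eps, inputs, targets)) / (2 * eps)

print(slope)
```

A non-zero slope means the output loss has something to say about the hidden weight, even though there is no "correct hidden state" to compare against.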
I'm confused about what a gradient is
when a neural net learns, it makes a guess. the computer then checks that guess against what the correct answer was
how badly it guessed is called the loss
you then use calculus (partial derivatives) to calculate the gradient
which tells you how to change each weight so that the loss goes down
oh that's confusing
@plush jungle
def forward(self, x, hidden_state):
    combined = torch.cat((x, hidden_state), 1)
    hidden = torch.sigmoid(self.in2hidden(combined))
    output = self.in2output(combined)
    return output, hidden
the naming in this code is confusing you. here is the same code with better variable names:
def forward(self, curr_x, prev_hidden):
    curr_combined = torch.cat((curr_x, prev_hidden), 1)
    curr_hidden = torch.sigmoid(self.in2hidden(curr_combined))
    curr_output = self.in2output(curr_combined)
    return curr_output, curr_hidden
So we set the "requires_grad" to True when we want to make a neural net
Hmmmmmmm
i recommend that you focus on understanding the equations first, at least on a conceptual level, before getting too deep into the technical details of pytorch
otherwise you will just confuse yourself and get lost in all the details
Where can I study them? Cuz I tried to look at the documentation and it seems pretty messy to me
maybe I would understand better if I knew the order that the back propagation updates the weights in? does it update all of in2output and then all of in2hidden?
what type of neural net are you trying to learn about? the structure of a CNN, an RNN, and a GAN are pretty different
each of the arrows in this diagram is an adjustment of a weight matrix
but does the order matter at all? wouldn't you need to recalculate the loss every time you change the hidden state weight matrix?
yeah the documentation is written for people who already know the math. i think a course in deep learning would be a good resource. e.g. fast.ai or the andrew ng course, maybe some universities also have courses, maybe MIT?
using the symbols from that post, it updates Whh and Wyh simultaneously
actually wait, no
i think you'd say it updates Wyh "first", because that's "after" Whh in the flow from input to output
basically you have to look at the chain rule and go from outside in
i don't know that it's helpful to think of it this way
because mathematically i don't think it matters
the output is computed first. that's the forward pass
in both my code and the blog post it goes all the way to the end and adds the loss right?
then the "backpropagation" stuff is just a metaphor for computing the gradient and updating all the weights at once
well yeah, but it's based on the previous hidden state
so it pulls in both Wyh and Whh
but the important point is (and the answer to your question): gradient descent updates all the weights at once as a single vector
other optimization algorithms like coordinate descent actually update individual weights in a cycle
oh!
which works great for certain problems, but not for deep learning
backprop is a metaphor, there is no actual "flow" in real-time. the "flow" is computed ahead-of-time as a single expression of the gradient vector
wait, does it calculate the gradients separately and then apply them at the same time?
well there's one "gradient" - the vector of partial derivatives
but each weight should have its own gradient right? because each weight needs to change by a different amount
each weight has its own partial derivative
the vector of all those partial derivatives is the gradient
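A minimal numeric sketch of that idea, in plain Python. The two-weight loss function is made up, and the partial derivatives are estimated by finite differences rather than calculus:

```python
def loss(weights):
    w0, w1 = weights
    return (w0 - 3.0) ** 2 + 2.0 * (w1 + 1.0) ** 2   # invented loss surface

def numerical_gradient(func, weights, eps=1e-6):
    """One partial derivative per weight, collected into a single vector."""
    gradient = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += eps      # nudge only weight i up...
        down[i] -= eps    # ...and down, holding the others fixed
        gradient.append((func(up) - func(down)) / (2 * eps))
    return gradient

grad = numerical_gradient(loss, [0.0, 0.0])
print(grad)   # approximately [-6.0, 4.0]
```

Each element is one weight's partial derivative; the list as a whole is the gradient.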
oh I see what you mean
when I said individual gradients I meant the elements of that vector
so "gradient descent" looks at the entire gradient as a single vector, which is often drawn as an arrow, to suggest the presence of both a magnitude and a direction
so the weight update (the "step") in gradient descent is 1 step in the direction of steepest decrease of the loss
imagine you have only 2 weights
gradient descent would be a step in any direction on the (x,y) plane
coordinate descent would be stepping only in the x direction first, then in the y direction, over and over
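The two-weight picture can be sketched like this, in plain Python. The loss function, learning rate and step count are invented, and the coordinate-descent variant shown (a small gradient step along one axis at a time) is only one of several ways to do it:

```python
def gradient(w0, w1):
    # Partial derivatives of the made-up loss (w0 - 3)^2 + 2 * (w1 + 1)^2.
    return 2.0 * (w0 - 3.0), 4.0 * (w1 + 1.0)

def gradient_descent(steps, rate=0.1):
    w0, w1 = 0.0, 0.0
    for _ in range(steps):
        g0, g1 = gradient(w0, w1)
        # Both weights move together: one step along the negative gradient.
        w0, w1 = w0 - rate * g0, w1 - rate * g1
    return w0, w1

def coordinate_descent(steps, rate=0.1):
    w0, w1 = 0.0, 0.0
    for _ in range(steps):
        # Step only in the x direction, then only in the y direction.
        w0 -= rate * gradient(w0, w1)[0]
        w1 -= rate * gradient(w0, w1)[1]
    return w0, w1

gd_result = gradient_descent(200)
cd_result = coordinate_descent(200)
print(gd_result, cd_result)   # both approach the minimum at (3, -1)
```

On a simple bowl like this both reach the same place; the difference is in how each step is taken.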
the difference with deep learning is that it's not just (x,y), it's a huuuge vector of every single weight in the whole model
it quickly becomes intractable to try and reason about that kind of a space even in abstract terms
which is why it's so nice that we have things like gradient descent
but in your 2D example
if we didn't have activation functions, or if we had a linear activation function, our weights and biases would be a line?
linear regression
and it's stuff like relu and sigmoid that make it a curve?
precisely! we use these nonlinear activation functions deliberately in order to introduce non-linearities in the model. so a neural network becomes a huge stack of these little nonlinear mini-models
back in the 1950s i guess they thought we could use that to model the human brain. turns out that wasn't true at all, but hopefully you can see how lots of little nonlinear things all interacting can lead to very very complicated emergent behavior
and yes, you can express linear regression as a neural network with no hidden layers and linear activation function
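A short sketch of why the nonlinearity matters, assuming NumPy. The shapes and random weights are made up. Stacking layers with a linear activation collapses into a single linear layer, while putting a ReLU in between does not:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))     # 5 made-up samples, 3 features
w1 = rng.normal(size=(3, 4))    # weights of a first "layer"
w2 = rng.normal(size=(4, 2))    # weights of a second "layer"

# Two layers with a linear (identity) activation...
stacked_linear = (x @ w1) @ w2
# ...equal one layer whose weight matrix is the product of the two.
single_layer = x @ (w1 @ w2)

# With ReLU between the layers, the collapse no longer holds.
stacked_relu = np.maximum(x @ w1, 0.0) @ w2

print(np.allclose(stacked_linear, single_layer))  # True
print(np.allclose(stacked_relu, single_layer))    # False
```

So without nonlinear activations, extra depth buys nothing beyond what a single linear model can already express.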
so could you say
that when the hidden state weights update
they're getting better at accurately storing the right patterns?
whereas the in2output weights are getting better at interpreting those patterns?
i wouldn't go that far with it. i'd say that the weights collectively represent some kind of encoded information about the training data
maybe you can say that different groups of those weights represent different kinds of information
and yeah, i guess you can call that "storage"
Wyh is a transformation from hidden state to output. Whh is a transformation between hidden states. so yeah, those two groups of weights probably encode/store different things
generated randomly
i think you are just seeing random variation between runs
So basically my test is useless?
not entirely useless, but you can't distinguish "results" from "random sampling variation"
for a better test, i recommend the following:
- generate the N = 4000 dataset
- generate the N < 4000 datasets by taking samples of the N = 4000 dataset
this way you are at least using the same data at each run
great idea thanks
you also might want to re-run the entire procedure several times in order to estimate the variances
e.g. for every run of (1), run (2) several times
then re-run the entire 1-2 procedure several times
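The procedure above could be sketched as follows, using only the standard library. `make_dataset` is a hypothetical stand-in for however the real data is generated, and the repeat counts (3 and 5) are arbitrary:

```python
import random

def make_dataset(size, rng):
    # Hypothetical generator; replace with the real data-generating code.
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

def draw_subsamples(full, sizes, rng):
    # Each smaller dataset is a random sample of the same full dataset.
    return {size: rng.sample(full, size) for size in sizes}

rng = random.Random(42)
runs = []
for outer in range(3):                  # re-run the whole procedure
    full = make_dataset(4000, rng)      # step 1: the N = 4000 dataset
    for inner in range(5):              # step 2: several draws per dataset
        runs.append(draw_subsamples(full, [1500, 2000, 2500], rng))

print(len(runs))   # 15 sets of subsamples
```

Training on each entry of `runs` and comparing the spread of losses at each size gives an estimate of how much of the difference is just sampling variation.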
yeah this is what I'm not getting conceptually. The information encoded in wyh is clearly the patterns in the dataset, like syntax, semantics, etc. everything I've seen about whh just says it's the "memory" of the RNN, which means it encodes all words that have been seen before
but if it's just memory, why would you need weights?
it should always be the same, and never change
i don't know if that's right
the hidden state itself encodes the syntax and semantics
or at least some abstract representation thereof
Wyh turns that abstract hidden state into a real word
the hidden state is like an abstract representation of "where" you are in the sequence
Wyh just turns that abstract representation into an actual element of the sequence
Whh tells you how to transition between these abstract positions in the sequence, the "hidden states"
(hopefully this also helps elucidate why RNNs can't really model the full range of natural language spoken by real humans)
in this code, the hidden state is being reinitialized at the beginning of every forward pass
```py
for sentence in dataset:
    hidden_state = model.init_hidden()
    input_tensor = get_one_hot_sentence_tensor(sentence)
    loss = 0
    index = 0
    for word in input_tensor:
        if index+1 >= len(input_tensor):
            break
        output, hidden_state = model(word, hidden_state)
        current_loss = criterion(output, input_tensor[index+1])
        loss += current_loss
        index += 1
    optimizer.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), 1)
    optimizer.step()```
wouldn't that undo all the training?
why back propagate the hidden state if you're just gonna reset it
it's being re-initialized at the start of every individual sequence
right sorry
otherwise you treat your dataset of sentences all as one big long sentence
right. you build up the loss and gradient for that one example by stepping through one word at a time, starting from the initial hidden state
oh, i see
don't forget, this is stochastic gradient descent
we do one weight update for each data point
there is no batching
the hidden state is not itself a learned weight
we aren't re-initializing the weights
we pick a sentence, step through it computing the gradient and loss, then do one weight update. then re-initialize the hidden state and repeat
yep, exactly
in an image recognition neural net, the hidden layer represents sub patterns that it's found in the dataset
Dunno, I just started learning about Pytorch tbh
I'll check it, thanks!
machine learning is super different depending on what you're trying to accomplish
image recognition, natural language processing, text/video/image generation
what interests you the most?
learn about the specific applications before learning about pytorch in general
the other way around would be like learning what a screwdriver is before learning about screws
but you can't go wrong with something like this video
https://www.youtube.com/watch?v=aircAruvnKk
What are the neurons, why are there layers, and what is the math underlying it?
Text/video and image generation in the first place
That makes sense ngl
text/video generation is done using RNNs, LSTMs, or transformers, because there is sequence data
image generation is done using GANs
In order to study Image/video generation I also have to know about OpenCV?
OpenCV is a screwdriver
machine learning models like transformers or GANs are screws
which interests you more, text/video, or image generation?
I think image generation
ok, then you want to look into GANs
I would like to look at them all btw
well start with image classifier neural nets
since a GAN is just a classifier with extra steps
For image recognition what do I have to know instead?
The transitions between hidden states represent semantic transitions within the sentence.
But keep in mind that this is just a model
are you familiar with the MNIST dataset?
Like I said, actual natural language spoken by humans is generally a lot more sophisticated than a sequence of "states"
Nnnope
it's thousands of images of handwritten digits 0-9. it's very commonly used to test image classification neural nets
in order to understand GANs and image generation, you should try to code an MNIST classifier first to understand classifiers
there are a bunch of tutorials for MNIST
Oh ok, so in order to know about machine learning I firstly have to study the mathematics and MNIST/GAN, right?
machine learning is a super broad category. GANs are the state of the art in image generation. to understand GANs you need to understand classifiers
tbh the math isn't super difficult unless you're an actual data scientist building your own model
try following this tutorial
and watch the 3blue1brown video on neural nets
Got it! Thanks really much!
feel free to pm me if you get stuck or have questions
It's important to note that this is a model trying to predict a specific label, it's not generative, and so it may not fully model the sequences, only what it needs to correctly predict the labels, which could be way less than the total. Ofc, this has the upside that it needs less weights and stuff and you can get away with a simple RNN, but if you wanted something to actually model the sequences (in totality) it would need to be generative / capture everything. @plush jungle
what I'm still struggling with is why have 2 neural nets? could you build an RNN where the Whh weights don't get updated?
(It also means that it does not generalize as well)
effectively creating 1 neural net
What do you mean 2 neural networks?
you don't have 2 neural networks. you have 2 different weight matrices. that's all a nn.Linear represents: a matrix of weights, a "linear transformation"
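As a rough illustration of "a matrix of weights", here is what a linear layer computes, written out with NumPy instead of PyTorch. The weights are random placeholders, and the sizes follow the 3-word, 6-hidden-neuron example from earlier:

```python
import numpy as np

input_size, hidden_size = 3, 6
rng = np.random.default_rng(0)

# A linear layer is a weight matrix plus a bias vector (random here).
weight = rng.normal(size=(hidden_size, input_size + hidden_size))
bias = rng.normal(size=hidden_size)

def linear(combined):
    # The linear transformation: combined @ weight.T + bias
    return combined @ weight.T + bias

x = np.array([1.0, 0.0, 0.0])                 # one-hot "alice"
prev_hidden = np.full(hidden_size, 0.5)       # the 6-value hidden state
combined = np.concatenate([x, prev_hidden])   # length 9

# Sigmoid squashes the result into (0, 1), giving the next hidden state.
new_hidden = 1.0 / (1.0 + np.exp(-linear(combined)))
print(new_hidden.shape)   # (6,)
```

The second layer in the model is the same kind of object with a different output size, applied to the same combined vector.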
yeah, that's a more accurate way of putting it
but if all self.in2hidden is doing is adding another layer of depth
the model has 2 stages: Whh turns the previous hidden state into the current hidden state, and Wyh turns the hidden state into the observed sequence value
then you're just modeling transitions between sequence values directly. which... sure? but that's not the model
If you can draw a graph of the neurons and the connections between them, and you can find a path from any neuron to some neuron in question, that neuron in question is part of the same network.
(ignoring the direction-ness of it)
When we say that we have two networks working together, it's just a useful split for us to understand what is happening. Although in the end when it runs, it's all just one big blob / mess.
(aka the "black box")
Well, it depends what you are doing, there are too many different types of networks to say that it's just one big black box in the end for all of them.
so if it's all one neural net
and you can unroll it
what would the net look like with 3 time steps?
For an RNN?
yeah
because if you look at this
the Wyh and Whh weight matrices are running in parallel
so I guess you could think of them as the same layer?
The unrolling is through time. You really only have the thing on the left, in terms of the neural network metaphor. The unrolling is because of the weight update method chosen.
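A minimal sketch of what "unrolling through 3 time steps" means, in plain Python so it runs without PyTorch. The weight values and the sizes are made up for illustration, and `W_xh` (input to hidden, the role `in2hidden` seems to play in the tutorial) is an assumed name. The point is only that the same matrices are reused at every step, so the unrolled picture is one network drawn three times.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

# Hypothetical tiny weights, chosen only for illustration.
W_xh = [[0.5], [-0.3]]            # input (size 1) -> hidden (size 2)
W_hh = [[0.1, 0.2], [0.0, 0.4]]   # previous hidden -> current hidden
W_yh = [[1.0, -1.0]]              # hidden -> output (size 1)

def rnn_step(x, h_prev):
    """One time step; the same three matrices are used at every step."""
    pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h_prev))]
    h = [math.tanh(p) for p in pre]
    y = matvec(W_yh, h)
    return h, y

def unroll(inputs):
    """Run the single recurrent cell over a sequence, one step per input."""
    h = [0.0, 0.0]  # initial hidden state
    outputs = []
    for x in inputs:
        h, y = rnn_step(x, h)
        outputs.append(y)
    return outputs, h

# Three time steps: three calls to the same cell, not three networks.
outputs, final_h = unroll([[1.0], [0.5], [-1.0]])
print(outputs)
```

Backpropagation through time then sums the gradient contributions of all three steps into the one shared copy of each matrix.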
hi, I use r"$\frac{1}{2}$" to use latex in matplotlib labels. I'd also like to use the f flag. How can I combine them?
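One way to do this, as a small sketch: Python accepts both prefixes together (`rf"..."` or `fr"..."`). The catch is that LaTeX braces then have to be doubled, because single braces mark f-string fields. The variable `n` is just a placeholder for whatever value goes into the label.

```python
n = 2  # hypothetical value to interpolate into the label

# Raw + format prefixes combined. Literal LaTeX braces are doubled ({{ }}),
# and the innermost single braces around n are the f-string field.
label = rf"$\frac{{1}}{{{n}}}$"
print(label)  # prints $\frac{1}{2}$
```

The resulting string can be passed to matplotlib exactly like the plain raw string was.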
hmm ok, let me retry then
Hi all! I need to do binary classification of time series data. The problem that I have is that I have a few true samples and then I have the rest of the data, where there might be more unlabeled true samples. I am assuming I can't just label the rest of the data as negative since this is not true. Has anybody encountered this challenge before or know how to approach it? Thanks!
hi Python gang, if we aren't working with Time Series do we need to keep our datetime64 feature? how can this help us?
Well part of training a model is it being able to distinguish between the two classes, if you don't know which datapoints belong to which class, it will be hard to train the model
If you have some data labeled and the rest unlabeled it wouldn't be such a problem
but the way you put it, you only have one class labeled partially
and the rest is unlabeled
@fluid sigil
No you do not, having the data in chronological order and putting it into supervised form works
https://machinelearningmastery.com/how-to-develop-deep-learning-models-for-univariate-time-series-forecasting/ here is a good example
So for example I'm working on a machine learning model to use Warren Buffett's approach when choosing stocks, and we've gathered all this awesome data and we have a column year but how useful is this? So I'm wondering if I just drop it because it's just the year
I believe it is not useful
thank you
So question about scaling my friends, when we see our feature isn't a Normal Distribution do we use the scaling techniques to get us there or should we convert the feature to a log value for example. Which of the two do we do / should or are these the same thing just different techniques?
Introductory courses for Machine Learning and Deep Learning
MIT 6.S191, Intro to Deep Learning: http://introtodeeplearning.com/
Fast.ai online courses: https://www.fast.ai/
Andrew Ng's classic Machine Learning: https://www.coursera.org/learn/machine-learning
Note: This is a living list! Please @ me if you have additional suggestions.
@serene scaffold pin?
> not doing a PR to the resources folder of the site repo
i didn't know if it counted as a site resource!
we have a machine learning section?
!resources
We're a large, friendly community focused around the Python programming language. Our community is open to those who wish to learn the language, as well as those looking to help others.
note the ?topics=data-science
anyway, by our own criteria, we can't add Andrew Ng's course, since it's in Octave or whatever.
(all resources have to present the information in Python or be language-agnostic)
is there a "how to add a resource for stupid people" page
I'll do it 
ty
i was looking in https://github.com/python-discord/site
@desert oar they're all in here: https://github.com/python-discord/site/tree/main/pydis_site/apps/resources/resources
if you just want to write the blurb for the other two, I will do the rest.
hm.. i haven't actually taken each course start to finish. i have watched sections of both videos though
well, let me know if you decide to write those.
6s191 is pretty good
im getting this error while converting the .weights into the corresponding TensorFlow model files
hi guys. is pycharm recommended for data science?
not exactly recommended, but it's a good option https://datasciencenerd.com/is-pycharm-good-for-data-science/ As a beginner, I prefer Jupyter Notebooks though, because it's much easier to make code segments that I can run individually. I want to check the result after every few lines, because I'm still learning. If you're already more skilled maybe you won't need that anymore
ah i see. i am very very newbie. and start study with module. now i use google colab because i dont need to install anything. but google colab need internet. my friend said that anaconda is better than colab. but i need other recommendation. i will try Jupyter. thanks a lot.
yeah Anaconda also includes Jupyter, but Anaconda has too many other options for us beginners. Glad to help
my computer not strong enough. i already install it yesterday. but maybe i should only install jupyter. how about vscode?
hi
im tryna do smth like this
but for classification
would i have to change the baseline model to logistic regression?
yeah that's what I mean, only Jupyter is best. VS Code I've only used a bit with Python, didn't like it at all, very unreliable with package imports
Hello, I'm in need of suggestions or guidance on finding material on learning python since Ill be needing it for my AI module at uni, any help?
!resources in general - though most of the time AI doesn't really requires much in-depth knowledge about the python language, just enough to glue things together.
If you have experience with any other language, or if they introduce the language a bit, you should be fine
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
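import tensorflow as tf

def my_metric_fn(y_true, y_pred):
    # both tensors have shape (batch_size, output_size)
    print('=>', y_true.shape, y_pred.shape)
    # with no axis argument, tf.norm returns one scalar for the whole batch
    return tf.norm(y_true - y_pred)

model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=[my_metric_fn],
)
# fit the keras model on the dataset
history = model.fit(X_train, y_train, epochs=150, batch_size=8, verbose=0)
# evaluate the keras model: the loss (MSE) comes first, then the custom metric
mse, norm_metric = model.evaluate(X_test, y_test)
# print('Accuracy: %.2f' % (mse*100))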
this is my code, now here i see shape of y_true and y_pred as (8,8) and (8,8)
its good to note that i have 50 columns, and y i get at the end is a vector of 8.
So i assume we have this (8,8) as in 8 vectors?
please let me know if my understanding is correct.
I basically want to predict the vector, or the nearest one to the test vector; which metric should I put here?
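On the shapes: (8, 8) here most likely means a batch of 8 samples (from `batch_size=8`), each an output vector of length 8. A hedged sketch in plain Python of the difference between one norm over the whole batch, which is what `tf.norm(y_true - y_pred)` without an `axis` computes, and a per-sample Euclidean distance averaged over the batch. The toy batch below (2 samples of length 3) is made up.

```python
import math

def per_sample_distance(y_true, y_pred):
    """Euclidean distance between each true vector and its prediction."""
    return [
        math.sqrt(sum((t - p) ** 2 for t, p in zip(true_row, pred_row)))
        for true_row, pred_row in zip(y_true, y_pred)
    ]

def whole_batch_norm(y_true, y_pred):
    """A single norm over every element of the batch difference."""
    return math.sqrt(sum(
        (t - p) ** 2
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    ))

# Hypothetical toy batch: 2 samples, each a vector of length 3.
y_true = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
y_pred = [[3.0, 4.0, 0.0], [1.0, 1.0, 1.0]]

distances = per_sample_distance(y_true, y_pred)
mean_distance = sum(distances) / len(distances)
batch_norm = whole_batch_norm(y_true, y_pred)
print(distances, mean_distance, batch_norm)
```

If "nearest to the test vector" is the goal, the mean per-sample distance is probably the more interpretable number; in TensorFlow that would be something like `tf.reduce_mean(tf.norm(y_true - y_pred, axis=-1))`.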
thanks for your answer
vs code works perfectly fine in low end pc like mine , but i personally recommend pycharm
hello guys , wanted some help for creating an AI assistant , if any help regarding modules and all please tell
I've been assigned to develop a CNN for binary image classification. The images are mostly of people, and the classes represent whether a well-known celebrity is present in the image. I thought it would be wise to get some feedback because I don't have much experience with neural networks beyond creating toy examples.
I have URLs for a few hundred images to use for training and validation, and I'm downloading them right now. As a first step, I'm planning to modify a pre-trained classifier to output predictions for two classes. I expect this won't work very well, and I'll have to experiment with hyperparameters to get a good model. Once the images are in hand, I'll then have to process them and perhaps create more training images by flipping or blurring.
Right now I'm planning to use r-keras for creating the model, which is more or less an R interface to keras. I know Python, but I'm more comfortable with R for machine learning, and in any case the remainder of the project is entirely in R. Of course, I might use a different framework if r-keras proves too constraining.
Given my inexperience, I was wondering whether I'm omitting anything or planning to do something that's a bad idea. The classification task is for academic research, and my supervisor says it's on a small enough scale that it could feasibly be done by a human coder, but that automating tasks like this is a better practice, which I wholeheartedly support.
I actually found out something. .evaluate() would give the loss for an example whereas .predict() would give the output. So, .evaluate() actually takes it a step further by finding the output and then printing the loss wrt the cost func.
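A rough sketch of that relationship, using a made-up linear "model" in plain Python rather than Keras itself: predicting returns the outputs, and evaluating runs the same prediction and then reduces it to a loss against the known targets. The function names mirror the Keras methods but are only stand-ins. In Keras the returned loss is computed over the whole dataset passed in, not a single example.

```python
def predict(weights, bias, rows):
    """Return the model output for every input row."""
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in rows]

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def evaluate(weights, bias, rows, y_true):
    """Predict first, then compare the predictions with the targets."""
    return mse(y_true, predict(weights, bias, rows))

# Hypothetical toy model y = 2x + 1 and toy data.
weights, bias = [2.0], 1.0
X_test = [[0.0], [1.0], [2.0]]
y_test = [1.0, 3.0, 7.0]

predictions = predict(weights, bias, X_test)
loss = evaluate(weights, bias, X_test, y_test)
print(predictions, loss)
```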
hm but that is for testing no?
Gotcha
Basically @serene scaffold was talking about how anaconda isn't used in industries and hence I thought of what might be the replacement for using anaconda
I wouldn't go as far as to say that it "isn't used in industries" (other companies might use it, but mine is deliberately trying to move away from it)
Yeah. Those are functions used for evaluating models.
Oh, alright.
oh i see, yeah alr!
cool
One question. I've seen people computing the cost function as :
Bias + w1x1 + w2x2 +... + error .
Why is the error term being used? Plus, I've also seen videos where the error term is avoided in the function.
I think without adding error, small changes in model cause model to become unfeasible.
Hi Absy, that's not how to compute the loss function. The equation below is the original regression equation.
Y = B0 + B1x1 + B2x2 + e
where:
Y = Yield (in ML lingo 'Label')
X = Explanatory variables (in ML lingo 'Features')
B0 = Intercept (in ML lingo 'Bias')
B1 & B2 = Slope (in ML lingo 'Weight')
e = The error/residual term (in my class I could recall a prof also calling it Gaussian noise)
We need to factor in the error because the model at its core is making a calculated guess (Y_hat), which is unlikely to match the actual Y exactly every time. And as you know, any given guess will be off by some margin (which is the error term).
If we take out the error term or refuse to acknowledge it in the first place, then we most probably won't be able to find a function that fits the data. Even if we do, the function will be notoriously horrible that we wouldn't wanna use it on any new data.
It's just as good as saying "I have a coconut head therefore I'm not interested in my model generalising"
=================
Once we've computed/fitted the data to the model, the regression line of the predicted value then changes to
Y_hat = B0 + B1x1 + B2x2
So by including the error term in the first equation, we can say we have a stochastic (statistical) model as opposed to a deterministic model, where the error term is always zero.
We're interested in having a stochastic model because it involves some level of estimation.
So by bringing in the error term, we're accounting for a sort of error which the model would make and hence preventing the model from overfitting. So then why use the bias term too?
by small changes do you mean that changing the hyperparameters of the model would make the model unfeasible in the context of not including the error term?
Not exactly, that alone doesn't prevent overfitting. Rather, I'd say once we admit we have an error term in the first place, we can analyze and minimize the error by first taking the difference between Y and Y_hat, and subsequently minimizing our loss function (MSE).
Remember some of the assumptions of OLS regarding the error term, yeah?
i) No autocorrelation, i.e. the error terms are uncorrelated with each other.
ii) Homoscedasticity, i.e. the error term is a random variable with constant variance (it is also assumed to have a mean of zero).
Residuals are our estimates of the error term. Hence if the residuals don't look independent and identically distributed, then perhaps it's time we fit a different function, change our predictors, change our estimation method, or even transform our data.
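To make the Y vs Y_hat distinction concrete, here is a small sketch with one explanatory variable and made-up data. It fits B0 and B1 with the closed-form OLS formulas and then computes the residuals Y - Y_hat, which are what you would inspect when checking the assumptions above. The names `b0` and `b1` follow the B0/B1 notation used earlier.

```python
def fit_ols(xs, ys):
    """Closed-form OLS for one explanatory variable: returns (b0, b1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b1 = s_xy / s_xx            # slope ("weight")
    b0 = y_mean - b1 * x_mean   # intercept ("bias")
    return b0, b1

# Hypothetical data that is roughly, but not exactly, linear.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(xs, ys)
y_hat = [b0 + b1 * x for x in xs]                # fitted values, no error term
residuals = [y - yh for y, yh in zip(ys, y_hat)]  # estimates of the error term
print(b0, b1, residuals)
```

With an intercept in the model the residuals sum to (numerically) zero by construction, so the interesting checks are their spread and whether they show any pattern against x.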
Does a dataset need to be in a specific format in order to work
I'm trying to make a dataset
it needs to be consistent. beyond that, it's up to the data scientist to transform the dataset into an input for what they're trying to do.
I still dont understand what data warehouse is
Hey, soon I will be conducting exercises regarding unsupervised learning. Can you suggest me what algorithms or exercises would be interesting? I was thinking about using unsupervised learning to detect anomalies in data using IsolationForest etc.
k-means clustering is usually the go-to example of unsupervised learning.
I think it's basically the same thing as a database, but for storing data that can be used for analysis, rather than for persisting data for a web service, or something.
a coworker recently introduced me to the idea of a "data lake", which I guess is the less-structured analogue of a data warehouse. The content of a data warehouse has a specific structure that the creator of the warehouse intended, whereas a data lake contains data in whatever format it was found.
Also, with all these terms, I think there's a point at which people just try to coin terms to prove how relevant they are, or something. The distinction between a "database" and a "data warehouse" and a "data lake" is probably situation-dependent.
I have heard of data lakes too and their definition is similar
But it was a couple of years ago and not that recent
Could someone help me out please?
I have two numpy arrays.
The first (input) is 27 columns of 1s and 0s, 5478 rows total
The second (output) is 3 columns of 1s and 0s, 5478 rows total
I'm using the tensorflow library on google colab, but I'm unsure which kind of model I should use for this, or which training method I should use
Any advice?
it would be more concise to say that you have two arrays of ones and zeros, of shapes (5478, 27) and (5478, 3). but we can't really tell you what to do with them unless we understand what they represent and what you want the model to be able to do.
The first array represents boards in tic tac toe
The second array represents the score for the board (winning, drawing, losing)
I want a model which can take a board state, and tell me if it is winning, drawing, or losing
I see. And I assume you know that you can get 100% accuracy with a simple set of rules, and that this is just an exercise?
It's my first time making any neural net, so it's just an exercise to learn
ah. well, is 5478 every possible tic tac toe board for a completed game?
I know I dont need a neural net to do it, i just want to see how to do it
its every board position you can reach from the beginning empty board
including the empty board as well, which is drawing
I'm not exactly sure what neural architecture would be right for this, but you want to deliberately overfit the model. do you know what overfitting is?
yes
its an intentional overfitting
since there's a known, finite set of inputs for this model, you might as well make a model that just memorizes them.
I don't know how to go about that
I'm not exactly sure myself. I would look into how to make neural networks that memorize the training data.
at the moment I'm trying to use a model that is sequential, with 25 layers, and then compiling the model for mean square error, then fitting the model
(which again, undermines the whole point of neural networks, but this is just for education.)
you aren't "compiling the model for mean squared error" per se. you're compiling the model, and the loss function you've selected is mean squared error.
yes
I don't know the correct grammar or terminology for discussing neural nets
also, the concept of "compiling a model" is specific to tensorflow/keras. it's just a design/terminology choice that they made, rather than a concept in neural network theory.
yeah, i tried writing this exact model in python (without using the TF libraries), but it was taking way too long on my machine
yeah, you don't want to do machine learning in "pure python".
do you have a GPU?
well, I guess you're using colab.
the advantage of GPUs is that they're massively parallel on the inside, so they can crunch numbers much faster.
until the value returned by the loss function is always zero, I guess.
yeah, neural networks are computationally expensive to train without a GPU
Colab lets you use a GPU on Google's cloud. that's the main advantage of it.
the answer is rarely obvious for this kind of thing.
Howdy y'all. Because I'm terrible at CV, I thought I'd bug you peeps for some thoughts. I'll give the toy problem + my current approach + my ask here. This is not an interview problem or homework, this is me trying to be less garbagio at some CV stuff.
Toy Problem Context: Suppose you have a video which is, for simplicity, 10px width by 10px length. Say that in this video is a pixel moving up and down in a sinusoidal manner --- but, sometimes, it'll "jitter" and break the sine. The problem is to find the time periods where the "jitters" occur.
My Plan of Attack: I'd like to make this into a timeseries, since I know how to do anom detection on that stuff. Since I haven't worked much on CV, I'm not exactly sure what techniques are used for things like this, if any. Right now, I'm taking the indices where a pixel is at t = t0 and plotting that as a TS, which works okay, but I feel it is not scalable to things repeating with 2D motion (eg, pixel going around in a circle, or slightly more complex shapes --- we can project these to 1D, but I'm not sure how well it will do).
My Ask: Given this, does anyone have resources for how people usually take CV-things like this and detect repetitive behavior?
anyone did reinforcement learning on LunarLander before?
Is it possible to have output of NN within like 15 to 20ms??
Bias is quite different from the error term. In Stats it's called the intercept i.e the expected mean value of Y when all your explanatory variables (Xi) = 0
Although, some would interpret the intercept as the point where your function crosses the y-axis.
In milliseconds? Well, impossible is nothing these days - - so I'd say it depends on the size of your data and the kind of architecture your NN has.
and your hardware
^
Bout to say small models in train will take about 60ms~ on my 2060 / a batch
Inference is bound to be a lot less.
@upper spindle try asking your actual question about LSTMs, not if anyone knows about them.
ohh okay, sorry im quite new to using public discord chats
does anyone know how to setup a LSTM model for forecasting cryptocurrencies with reddit posts/comment sentiment values?
No problem. On this Discord, or any other real-time chat, you're always more likely to get an answer if you jump right in to your question and give people enough information to start answering it should they glance at the channel.
have you already figured out how to scrape reddit posts/comments (using the reddit API only)?
i have yes
thanks for that
great! and I'm assuming the posts/comments have timestamps. Do you also have a list of crypto values by date (so you know the value of that currency over time)?
yes, and i do have the crypto values (imported from yahoo)
Not really, it depends on the problem and there are many different approaches out there. You can find endless outlier or anomaly detection papers out there for vision related tasks, but they almost all depend on the specific problem context. One neat thing about video is that a human can directly watch it and spot anomalies, but humans can't really go through a giant table of numbers unless they graph it. Vision is the dominant input form. In general you will probably find some time series prediction thing (if it's a time series problem (standard stuff)) and then that is used to detect outliers or anomalies (can be supervised, semi-supervised, or unsupervised). In this specific problem you are not expecting sudden jumps, but smooth motion, so you can use that to determine if something was a "jitter".
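As a sketch of the "smooth motion" idea, assuming the pixel position has already been extracted as a 1D series: a sampled sine has a small second difference everywhere, so a sudden jump stands out as a spike in it. The frequency, the jitter size and the threshold below are all made-up values for the toy case.

```python
import math

def find_jitters(signal, threshold):
    """Flag time steps where the discrete second difference is too large."""
    flagged = []
    for t in range(1, len(signal) - 1):
        second_diff = signal[t + 1] - 2 * signal[t] + signal[t - 1]
        if abs(second_diff) > threshold:
            flagged.append(t)
    return flagged

# Hypothetical toy series: a clean sine with one jitter injected at t = 50.
signal = [math.sin(0.2 * t) for t in range(100)]
signal[50] += 0.5

jitters = find_jitters(signal, threshold=0.2)
print(jitters)  # [49, 50, 51]
```

The jitter shows up at its own index and at both neighbours, because each second difference looks one step to either side. On real footage the threshold would have to be tuned against the noise in the position estimate.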
How do you know that the reddit content that you're using relates to the current or future value of cryptos? are you only pulling from specific subreddits?
CV is often more focused on finding out where that particle would be based on the image input. After you know where it is you can do normal stuff.
Hm, yeah, I'll prob look around the landscape and see if I can transform this problem into something I know. That's good to know, though, thank you. I've got my little crummy model now, and for more complex things I'm just mapping in a weird way to 1D and doing a time series from that. :']
only pulling from subreddits (bitcoin, ethereum, solana), i have pulled all of the comments and posts for 2020 & 2021
I'll share it when I figure out something cool, and y'all can laugh at me and/or tell me what they'd do. :p
There are mapping motion to 1D methods.
Re: the mappings, yeah, I honestly think that's what I'm gonna be using. I'm trying different ones, but this kind of seems like the easiest possible way to go.
I found a few papers on it, but, yeah, very situational. I'm using very easy ones now (circular motion, mostly) and that'll prob work, but I think it'd be cool to mess with some other ones.
The problem with methods that map to 1D is that they are often used in a way that also filters out noise, so the jitters would be ignored.
import datetime as dt

import pandas as pd
import praw
from psaw import PushshiftAPI

r = praw.Reddit(client_id="...", client_secret="...", user_agent="Sentiment")
api = PushshiftAPI(r)
start_epoch = int(dt.datetime(2020, 1, 1, 0, 0, 0).timestamp())
end_epoch = int(dt.datetime(2021, 12, 31, 23, 59, 59).timestamp())
comments = api.search_comments(after=start_epoch, before=end_epoch, subreddit='Solana', limit=1000000)
# `posts` is not defined in this snippet; it presumably comes from a similar submissions query
submissions = []
for post in posts:
    submissions.append(
        {
            "Subreddit ID": post.subreddit_id,
            "Subreddit": post.subreddit,
            "Title": post.title,
            "Body": post.selftext,
            "Number of Comments": post.num_comments,
            "Score": post.score,
            "Time": post.created,
            "URL": post.url,
            "Post ID": post.id,
        }
    )
But you can totally have it accept all the noise.
That is my code above to get the comments
Yeah; right now, I'm mostly doing a kind of "threshold" on a grayscale image and projecting its darkest points down and working with that. It works okay for my specific problem, but it def would not work for anything more complicated.
CV's bigger issue is dealing with all the complexity / noise of finding out where the particle is in the first place.
After that it's regular time series stuff on a 2D input.
Well, maybe 4D if you include velocity via motion detection.
Ofc, there is also all of https://en.wikipedia.org/wiki/Particle_filter , since this is a physics-y thing.
Particle filters, or sequential Monte Carlo methods, are a set of Monte Carlo algorithms used to solve filtering problems arising in signal processing and Bayesian statistical inference. The filtering problem consists of estimating the internal states in dynamical systems when partial observations are made, and random perturbations are present i...
To help out in this context of moving particles.
Huh, neat. This seems similar to Kalman, but I've not used a lot of either.
Example usage: https://www.youtube.com/watch?v=elqAh3GWRpA
Today we will discuss how the most powerful exploit in server history caused the fall of Minecraft's 2b2t, the oldest anarchy server in the game, and the fallout of the events that took place.
"After being located, a Monte Carlo particle filter was used to keep up with the player. This is an adaptive system that learns over time (after just a few seconds, really) where the player probably is, and their movement speed."
Huh, alright, coolio. I'll check this stuff out. Thanks for the references!
Also you could use Fourier analysis for this specific problem (find the sinusoidal while ignoring jitters and then predict with it).
Anyhow, CV is more focused on even knowing where the objects are in the first place, than it is with where the objects will go (except for some basic motion detection (probably optical flow)). Well, it's still part of the CV problem, but after you used some CV method to figure out the higher level details (like "my object is over here"), you can use regular stuff. This all changes if you want to do something end-to-end-like (when you start doing NN stuff).
Hi all, I had a question. I am running two models, a simple one and a complex one. In the complex one I interact polynomial variables with categorical variables. I've noticed one categorical variable always screws up my MSE and R2, as in individuals that have that categorical variable == 1 tend to be huge outliers when we predict them. How can I deal with this problem ex ante? It's not the case that individuals who have categorical variable == 1 also have large Y values, so I am very confused.
I have any output + code needed, this is for a learning project, nothing too serious
just making me feel crazy
@mild dirge you were helping me with this yesterday, I have more or less pinpointed everything down to what variable is messing me up, now learning how to actually deal with it haha
are you normalizing data yet?
A tide-predicting machine was a special-purpose mechanical analog computer of the late 19th and early 20th centuries, constructed and set up to predict the ebb and flow of sea tides and the irregular variations in their heights, which change in mixtures of rhythms, that never (in the aggregate) repeat themselves exactly. Its purpose was to shor...
No, could you send information on that? I have another suspicion, the data has an experience variable and "exp1" "exp2" "exp3" "exp4", however if these were polynomials they don't mathematically make sense
Yeah, that's approx what I'm doing after I get the time series out. :'] Haha, it's a sweet solution.
It's making sure the data lies in the range of 0 to 1 for every variable
"They came to be regarded as of military strategic importance during World War I,[4] and again during the Second World War, when the US No.2 Tide Predicting Machine, described below, was classified, along with the data that it produced, and used to predict tides for the D-Day Normandy landings and all the island landings in the Pacific war."
Ahhh I see, just put everything into standard devs?
often MinMaxScaling is used
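For comparison, a plain-Python sketch of the two options being discussed: min-max scaling into the 0 to 1 range, and standardizing into standard-deviation units (the econ habit). The `experience` values are made up, and both functions assume the column is not constant. scikit-learn's `MinMaxScaler` and `StandardScaler` do the same per column.

```python
import statistics

def min_max_scale(values):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical feature column.
experience = [0.0, 5.0, 10.0, 20.0]

scaled = min_max_scale(experience)
z_scores = standardize(experience)
print(scaled)
print(z_scores)
```

Neither transform removes outliers; they only change the units, so a category whose rows predict badly will still predict badly after scaling.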
interesting
Kelvin was way too smart.
Beautiful, ill look into it, im coming from econ research so we normally put stuff in standard devs, let it rip
thanks for your help pccamel, much appreciated
Another fun solution is using a self organizing map: https://en.wikipedia.org/wiki/Self-organizing_map
A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the topological structure of the data. For example, a data set with p variables measured in n observations c...
It can map the image to 1D, and similar positions of the point will have similar positions in the 1D column.
(Next to each other in output)
Woah, this looks legit, I've never heard of this one!
Hopefully my original answer makes more sense now, I can't really provide resources because everyone just kind of does their own weird thing that just works for them.
Yeah, these are good resources, I'll try'em out and see what works! Thank you!
And IDK of any summary blogs.
I guess these days many would immediately jump to LSTM or something, but I think that's often a bad idea due to the amount of computational power needed, training time, and LSTM is not even very good anyhow (at least use GRU).
A decent to good solution almost always involves the information about the specific problem and LSTM does not really make use of that by default.
It's sort of like brute forcing the problem (because it's generic prediction with no specific parts to the problem / just massive gradient descent with lots of parameters) vs realizing that you can do much less work.
i definitely used it in industry and it was much better than the alternatives, because containerization was not an acceptable option for our workflow
but it still has some "sharp edges" and has a learning curve to work with it
What's up Python gang, when we are one hot encoding, are we supposed to Scale BEFORE on our features or do we scale AFTER.
Sometimes I forget simple steps in this process :((
@thin palm just so we're on the same page, what do you mean by scale?
Scale as in feature scaling for example I'll be using RobustScaler()
but not sure if I should first One Hot Encode my column and then scale or do opposite
Do you understand what problem one-hot encoding is intended to solve?
To my knowledge it is the ability for our machine learning model to interpret text as number for our model.
It's not just text. It can be used to represent any discrete feature.
Do you know what discrete means?
In this case, it might be more accurate to say that one hot encoding is for representing nominal features. So instead, tell me what you think a nominal feature is.
I don't know if you're still there, but since one-hot encoding involves having a vector of all 0s except for one 1, and that vector represents a value that is not quantitative (like a word, in your case), there is nothing to scale.
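A small sketch of what that looks like, in plain Python with a made-up nominal column. Each value becomes a vector of 0s with a single 1, so there is no magnitude left to scale; a scaler such as `RobustScaler` would only be applied to the numeric columns.

```python
def one_hot(values):
    """Return the sorted category list and one 0/1 vector per value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

# Hypothetical nominal feature.
sectors = ["tech", "energy", "tech", "retail"]

categories, encoded = one_hot(sectors)
print(categories)  # ['energy', 'retail', 'tech']
print(encoded)     # [[0, 0, 1], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
```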
I have recently wanted to start learning Data Science/ AI. Where should I start? (Sorry if this question is asked too many times)
Hello, Can anyone suggest essential skills that is required to start freelancing as Data Analyst
For tensorflow, trying to solve the traveling salesman problem, with rows of [location id, coord x, coord y], finding the shortest path c→a→b,
```
[
    ['a',0,0],
    ['b',3,0],
    ['c',0,4]
]
```
but with a flexible number of locations, which means a flexible number of features, how do I train a tensorflow regression model?
what's ols?
I get that but won't some people increase the bias in the equation so as to prevent overfitting?
By alternatives do you mean venv?
I need that response on device like mobile
does anyone here know how to use sentiment values from reddit comments to forecast cryptocurrency prices, or have done a similar project?
I want to find a fellow dev with experience (even so slightly) training and setting up for example huggingface models.
you have to ask your actual question, not if there's someone who knows about a general concept. didn't I mention that before?
read my previous comment ^
Thank you.
What I am actually looking for is a developer to work with. I am a frontend-dev looking for a partner for a project.
Alright. Keep in mind that it is not very likely that you will find an ongoing project partner in this server. I'm not sure where you can go to find that.
Any ideas where I can look?
I said "I'm not sure where you can go to find that." in anticipation of that question; not really.
Hi guys, what libs do you recommend for image recognition? (I wanna detect if there is $x bill on image)
do you think bs4 questions are better answered here or in webdev ?
Question about image normalization. I'm planning to use the ResNet18 pre-trained classifier on a training set of .png images. Because they are pngs, some of the images, when converted to raster arrays, contain an alpha channel in addition to the usual R, G, and B channels. That channel has to be removed somehow, because ResNet18 expects tensors with three channels, not four. A brute-force solution would be converting the images to .jpg, which would remove the alpha channel entirely. Is there a better approach, or is conversion a reasonable solution?
you load the images as something before feeding it to the model right? can you just drop whatever you want then?
we're just talking about dropping a channel, right?
or a column...
That would also be possible. The alpha channel is always the last, apparently, so I can just drop it from the array.
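A minimal sketch of dropping the last channel, using nested lists so it runs without any imaging library. It assumes a height × width × channel layout with alpha last, as described. With a NumPy array in that layout the equivalent slice would be `array[..., :3]`, and Pillow's `Image.convert("RGB")` does the conversion at load time.

```python
def drop_alpha(image):
    """Keep only the first three channels (R, G, B) of every pixel.

    `image` is a list of rows, each row a list of pixels, each pixel a
    list like [R, G, B, A]; pixels that are already [R, G, B] are unchanged.
    """
    return [[pixel[:3] for pixel in row] for row in image]

# Hypothetical 2x2 RGBA image.
rgba = [
    [[255, 0, 0, 255], [0, 255, 0, 128]],
    [[0, 0, 255, 0], [10, 20, 30, 255]],
]

rgb = drop_alpha(rgba)
print(rgb)
```

Note that this discards transparency rather than blending against a background, so fully transparent pixels keep whatever RGB values were stored underneath.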
5k calls just got traded on TWLO and NOTHING shows up on UW
I have to go back to cheddarflow or OptionsFlow
you get what you pay for
whoops
wrong place
hm so I just found out that tf does automatic differentiation of the nonlinearity functions, so if we supply our own nonlinearity function we don't need to provide its derivative as a separate function.
THATS JUST SO COOL
I'm new to this and I'm having issues understanding how to fit a logistic regression model around a particular dataframe I'm working with. If someone could reach out to help I would appreciate it.
https://github.com/jina-ai/docarray can be interesting to data scientists/AI engineers working with unstructured data
are there any good tutorials for tensorflow object detection that aren't outdated and don't give me an error every other line
i've been trying to learn it for 3 days now and i encounter tons of errors every single day
OLS = Ordinary Least Squares. It's a technique used in Statistics for estimating the relationship between one or more explanatory variables and a response variable.
In essence, the method fits a straight line by minimizing the sum of the squared differences between the observed and predicted values of the response variable.
OLS and gradient descent arrive at the same fit by different approaches: OLS solves for the coefficients directly, while gradient descent iterates toward them.
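To make the comparison concrete, here is a small sketch on synthetic data (the data, learning rate, and iteration count are assumptions picked for the demo, not anything from the discussion). It fits y = mx + b once with a least-squares solve and once with plain gradient descent on the mean squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=200)  # true line: m=3, b=2

# OLS: solve the least-squares problem directly.
X = np.column_stack([x, np.ones_like(x)])
m_ols, b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Gradient descent on the mean squared error of the same model.
m_gd, b_gd = 0.0, 0.0
lr = 0.01
for _ in range(20000):
    residual = (m_gd * x + b_gd) - y
    m_gd -= lr * 2 * np.mean(residual * x)  # d(MSE)/dm
    b_gd -= lr * 2 * np.mean(residual)      # d(MSE)/db

print(m_ols, b_ols)
print(m_gd, b_gd)
```

Both pairs should agree closely; the difference is only in how the minimum is reached.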
The bias I was referring to was in the context of b in y = mx + b. But when it comes to Bias-Variance tradeoff, to solve the overfitting problem you'll have to reduce your model complexity. By doing so, the variance decreases and bias increases.
```py
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf,tensorflow.compat.v1 as tf
inputs = tf.ragged.constant([[1,2,3,4],[],[5,6,7]])
output = tf.ragged.constant([[2,4,9,16],[],[25,36,49]])
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=[None], dtype=tf.int64, ragged=True),
    tf.keras.layers.Embedding(1000, 16),
    tf.keras.layers.LSTM(32, use_bias=False),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Activation(tf.nn.relu),
    tf.keras.layers.Dense(1)
])
model.compile(loss='mean_squared_error', optimizer='sgd')
model.fit(inputs, output, epochs=5)
print('WARMINGS DONE')
print(model.predict(inputs))
```
```1/1 [==============================] - ETA: 0s - loss: nan
1/1 [==============================] - 3s 3s/step - loss: nan```
Epoch losses are `nan`, is it a problem with the layers?
okay I think I see where you're going with this. In the bootcamp I attended they only spent a few hours on one-hot encoding and emphasized that it essentially turns text into numbers. Thank you for the clarification on this.
No problem!
I'm sorry idk about mobile.
Are you a data scientist or ML engineer?
They are.
Needed to take some advice from them as my placement season is approaching
Sure, but always direct your questions to the whole channel, not specific people.
If you need career-specific advice, try #career-advice.
what's up Python gang any advice on how to make sure your ML model is a decent one and how we can further examine it? I've been checking around Confusion Matrix which I enjoy but some of my scores are looking INTERESTING
There's lots of stuff
you have accuracy, F1-score, precision, recall, confusion matrix etc.
There's also some more meta stuff like prediction time
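For reference, the metrics listed above all fall out of the four confusion-matrix counts. A plain-Python sketch for the binary case, with made-up labels (the function name and the toy data are hypothetical):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```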
well just when I ran a cross_validate on my machine learning model it produced a score as high as .88, but when I do a .fit on my X_test it shoots down to .57
and I'm just thinking, these scores are way too far away from each other
that's a pretty clear sign of overfitting
ahhhhh
So my Algo is using the KNN Classification
and I used n_neighbors as 5, which I would assume is not overfitting. How can I further investigate?
depends on how many samples you have in your training data
if you have billions of samples, 5 would be pretty low
samples you mean how i split my X_train and X_test
Well there's multiple causes for your model performing worse on your test data
one could be that your test data is somehow not similar to your training data
Another could be that you don't really have enough samples to get an idea of how well your model performs
say you have 6 test samples, and you get 5 right, doesn't say a whole lot about how good your model is with this few samples
And even knn can overfit, if K is very low, it might overfit on your training data
Don't think that by itself would explain the big gap between test acc. and validation acc. though
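To put a number on the "6 test samples, 5 right" point, here is a sketch that computes a 95% Wilson score interval for the accuracy. The choice of the Wilson interval is mine, as one common way to get an interval for a proportion; it is not something the discussion prescribes:

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wilson score)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

low, high = wilson_interval(5, 6)
print(f"accuracy 5/6 = {5/6:.2f}, plausible range {low:.2f} to {high:.2f}")
```

With so few samples the interval spans roughly half of the whole 0 to 1 range, which is why a single score on a tiny test set says little.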
Hi if I wanted to get started in machine learning or AI what are some prerequisites that I would need to know.
statistics and linear algebra mostly
Is there no need for calculus
There is, but it does not play as huge of a role as those two, it's probably a good 3rd though
So the stuff on khan academy should be fine right? or should I get like a udemy course.
oh ok, thanks for replying though.
heyyy. I am looking for a way to find the x-coordinate of the intersection between a graph and the x-axis using matplotlib and numpy ? I really don't know how to figure it out
You got an array of y values or something?
@gleaming remnant
Need more information to help
Can we voicechat ? I will show you my screen
I'm using a formula to graph the trajectory of a projectile
It is for a physics assignment
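A sketch of one way to do this with numpy alone, assuming the trajectory is available as sampled `x` and `y` arrays: find where `y` changes sign and interpolate linearly between the two neighbouring samples. The launch speed and angle below are invented for the demo, and samples that are exactly zero are not counted as crossings:

```python
import numpy as np

def x_intercepts(x, y):
    """x positions where the sampled curve y(x) crosses y = 0 (linear interpolation)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.where(np.sign(y[:-1]) * np.sign(y[1:]) < 0)[0]  # sign changes
    x0, x1, y0, y1 = x[idx], x[idx + 1], y[idx], y[idx + 1]
    return x0 - y0 * (x1 - x0) / (y1 - y0)

# Hypothetical projectile: 20 m/s at 45 degrees, no air resistance.
g, v0, theta = 9.81, 20.0, np.radians(45)
x = np.linspace(0, 60, 2001)
y = x * np.tan(theta) - g * x**2 / (2 * v0**2 * np.cos(theta)**2)

landing = x_intercepts(x, y)
print(landing)
```

The result can then be marked on the plot with something like `plt.plot(landing, np.zeros_like(landing), "o")`.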
It's probably best if you open a help channel (#how-to-get-help) and provide all information necessary to answer any question you may have
!paste
Pasting large amounts of code
If your code is too long to fit in a codeblock in discord, you can paste your code here:
https://paste.pythondiscord.com/
After pasting your code, save it by clicking the floppy disk icon in the top right, or by typing ctrl + S. After doing that, the URL should change. Copy the URL and post it here so others can see it.
What do you intend to work on next? The purpose of your project will inform the kind of dataset needed.
a uni student
If you want to start with computer vision I recommend MNIST handwritten numbers
Any numpy / pandas pros in here?
Always ask your actual question, not if there's an expert.
Hey guys, I'm working on a project that involves Python and IMAP. I'm quite stuck on how to build the criteria that the script will use to search the emails and retrieve attachments with a particular extension, sent on a particular date and from a particular address. Any help?
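A rough sketch of how such a search could look with the standard library. The sender, date, and file names are placeholders, and `fetch_attachments` assumes a logged-in `imaplib.IMAP4_SSL` connection with a mailbox already selected, so it is defined but not run here; the offline demo at the bottom only exercises the criteria string and the attachment filter:

```python
import datetime as dt
import email
import imaplib  # only needed for the live connection passed to fetch_attachments
from email.message import EmailMessage

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def imap_date(day):
    """Format a date the way IMAP SEARCH expects, e.g. 11-Feb-2022."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year}"

def build_criteria(sender, day):
    """Search key for mail from `sender` received on `day`."""
    return f'(FROM "{sender}" ON {imap_date(day)})'

def attachments_with_extension(message, extension):
    """(filename, bytes) for every attachment whose name ends with `extension`."""
    found = []
    for part in message.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(extension.lower()):
            found.append((filename, part.get_payload(decode=True)))
    return found

def fetch_attachments(conn, sender, day, extension):
    """Assumes `conn` is a logged-in IMAP4 connection with a mailbox selected."""
    _, data = conn.search(None, build_criteria(sender, day))
    results = []
    for num in data[0].split():
        _, msg_data = conn.fetch(num, "(RFC822)")
        message = email.message_from_bytes(msg_data[0][1])
        results.extend(attachments_with_extension(message, extension))
    return results

# Offline demo with a hand-built message instead of a real mailbox.
demo = EmailMessage()
demo["From"] = "alice@example.com"
demo.set_content("report attached")
demo.add_attachment(b"a,b\n1,2\n", maintype="application",
                    subtype="octet-stream", filename="report.csv")
demo.add_attachment(b"notes", maintype="application",
                    subtype="octet-stream", filename="notes.txt")

criteria = build_criteria("alice@example.com", dt.date(2022, 2, 11))
csv_files = attachments_with_extension(demo, ".csv")
print(criteria, csv_files)
```

The server only filters by sender and date; the extension check has to happen client-side after fetching each message.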
Oh okay, so I am looking at data in a dataframe, I select the value using dataframe_result[i][m] and use it to compare to value dataframe_reference[i][m]. I ran it once works perfect, go to run again it throws 'The true value of an array with more than one element is ambiguous. Use a.any() or a.all()'. So okay fine, I go result[i][m].any(), etc. Now I get "error 'int' object has no attribute 'any'" WTF???
Been trying to solve various ways for 3 hours
Any big daddys in here that can help?
So confused how it compiles and runs perfectly and then breaks as well
Can I force this to run and just be done with it?
Would having formulas in an excel sheet f it up? Like ROUND?
please show the code and the whole error message starting from Traceback
compile time and run time errors are different.
because of the way Python is designed, fewer potential errors can be detected at compile time.
No. Code follows the instructions that you give to it, and if you give it instructions that cannot be completed, it breaks.
I understand that you're trying to get something done here, but you need to maintain patience and a positive attitude about learning.
Sorry man, I've tried many different things, but it's not working and it's frustrating that it once worked and I changed nothing
Unfortunately I run discord on the PC I'm using for this
Cannot run*
hey guys quick question
what is information gain
reduction in entropy
so as entropy decreases, information gain increases?
@knotty barn unfortunately there's not much anyone can do to help unless you can share the information I asked for earlier. I'm sorry this is frustrating for you.
Keep in mind that Discord can be used in the browser.
Can I send a pic lol?
I miss the old days when my code worked lol
How can my value be considered both an int64 and an array?
I don't look at pics of code, but someone else might.
It might be an array of 64 bit integers.
If it were an array then I could use .any() or .bool() on it, but says it's not an array
But rather an int64. Then when I try to run it as if it were an int64, it says it cannot do that because it's an array lol
Clearly there's a misunderstanding here. See how creative you can be about getting the actual text into this Discord. For my own sanity, I draw the line at reading screenshots of text.
I pull off .csvs, but I can rewrite the code I guess
you can drag/drop CSV files directly into the Discord client.
So you won't have my data tables, but I can type the code on my phone lol
I'm skeptical that you can't log into Discord from a browser, though.
I can, but I am not allowed to do from my work PC
rip
Lol yep
as a last resort, I usually email stuff to my work email.
my company blocked all non-work email inboxes because too many Karens in non-technical roles open every email they get
Can't email any files or information to my personal email from my work email lol

well, yeah, in information theory information literally is negative entropy. For that reason, a perfect way of compressing information produces a result that looks like random noise, with no regularities whatsoever - because any regularity in the data would be predictable, and so you could improve the compression by dropping these predictable parts.
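For the decision-tree sense of the term, "information gain = reduction in entropy" can be computed directly. A small sketch with invented labels, where the gain of a split is the parent's entropy minus the size-weighted entropy of the children:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes"] * 5 + ["no"] * 5

# A split that separates the classes perfectly removes all the entropy.
perfect = information_gain(parent, [["yes"] * 5, ["no"] * 5])

# A split that leaves each child as mixed as the parent gains nothing.
useless = information_gain(
    parent,
    [["yes", "yes", "no", "no"], ["yes", "yes", "yes", "no", "no", "no"]],
)
print(perfect, useless)
```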
y_actual = result_db[i]
y_hat = reference_db[i]
if y_actual[m] == y_hat[m] == 1:  # <- this is the line that raises the error
There's a lot of stuff going on but essentially result_db and reference_db are dataframes with columns of 0s and 1s
i = column header label, and the m = the row within column... or so I thought until it broke randomly
Huh, TIL that you can do x == y == 1 in Python. I never saw that before.
Spice, if you print out y_actual[m] are you getting a series or an integer? You may also want to use loc or iloc to get rows / columns for pandas.
Usually, that any/all error means "you're comparing a series / df to a series / df, and it's giving us a lot of true/false values."
In fact, as I just learned, the double double-equals is messing you up, probably. It's trying to compare the second thing to 1 --- that's a data frame to an integer. EDIT: Removed long code example, since the error is easier, see below.
So, you might want to use something like this instead of the x == y == 1 thing: (df["pred_value"] == df["true_value"]) & (df["pred_value"] == 1).
y_actual[m]==y_hat[m]==1 is equivalent to (y_actual[m] == y_hat[m]) and (y_hat[m] == 1). If these are series, that and will fail
for elementwise AND on series you need & instead of and, and so must write it explicitly
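A small sketch of the difference, using a toy frame with the hypothetical column names from the suggestion above. The explicit `&` version gives an elementwise mask, while the chained form ends up calling `bool()` on a Series and raises:

```python
import pandas as pd

df = pd.DataFrame({"pred_value": [1, 0, 1, 1],
                   "true_value": [1, 0, 0, 1]})

# Elementwise: rows where prediction and truth agree AND both are 1.
both_one = (df["pred_value"] == df["true_value"]) & (df["true_value"] == 1)
true_positives = int(both_one.sum())

# The chained form expands to `(a == b) and (b == 1)`, and `and` needs a
# single bool from the first Series, which pandas refuses to give.
try:
    df["pred_value"] == df["true_value"] == 1
    chained_failed = False
except ValueError:
    chained_failed = True

print(both_one.tolist(), true_positives, chained_failed)
```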
I do not like that syntactic sugar. Also, dang, x <= y <= z works too? Where have I been...?
yeah, they are all very nice
you can actually chain them arbitrarily long. Also, somewhat cursedly, in is also a chaining operator
!e
import dis
dis.dis("a < b in c > d")
@tidal bough :white_check_mark: Your eval job has completed with return code 0.
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 DUP_TOP
6 ROT_THREE
8 COMPARE_OP 0 (<)
10 JUMP_IF_FALSE_OR_POP 14 (to 28)
12 LOAD_NAME 2 (c)
14 DUP_TOP
16 ROT_THREE
18 CONTAINS_OP 0
20 JUMP_IF_FALSE_OR_POP 14 (to 28)
... (truncated - too many lines)
Full output: https://paste.pythondiscord.com/ivazecuhix.txt?noredirect
Haha, I do not like that.
But the x <= y <= z one might be useful. I use this a significant amount for timeseries stuff.
absolutely. sadly, due to the and thing, can't be used if the result isn't a single bool
But it's going to be... is this x <= y and y <= z, but it's not necessarily transitive, yeah?
Like, if you have a weird custom operator for <= like "direct child of" for example.
Yeah, no promises. It always gets interpreted as the equivalent of x<=y and y<=z.
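One detail worth seeing in action: the chained form is like `x <= y and y <= z`, except that the middle expression is evaluated only once, and the right-hand side is skipped when the first comparison fails. A pure-Python sketch (the helper functions are just there to record calls):

```python
calls = []

def middle():
    calls.append("middle")
    return 5

def last():
    calls.append("last")
    return 10

# Both comparisons run; middle() is evaluated once, not twice.
chained = 1 <= middle() <= last()

# 7 <= 5 is False, so the `and` short-circuits and last() never runs.
short = 7 <= middle() <= last()

print(chained, short, calls)
```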
Yeah, that should be fairly clear in very specific situations. Hm.
Haha, I always have to think, "Is this going to be readable?" before I try to change any of my code style stuff.
Python be like: ```py
>>> 2 <= (x := 3) <= 4
True
```
Walrus Operator: Greatest Operator.
They added the ability to assign in inside expression, which people in C hated for code understandability.
Adding bad features from C is hip now though.
I'm still unsure how I feel about walrus, stylistically. I've used it VERY rarely for if-statement stuff, but I don't know how readable it is in general.
if value := my_long_function_name_that_i_dont_want_to_type_again(things):
Wait until you find stuff people used to do in C like: ```py
>>> x = (x := 3) + 1
>>> x
4
```
But it gets much worse.
Yeah, I do not want to use it like that.
I feel like if I use it, I'm going to limit it to the if use case.
Yeah, that's --- yeah, nope, not for me.
And the classic, editing two vars at the same time: ```py
>>> x = 1
>>> y = 1
>>> y = (x := x + 1) + 2
>>> x
2
>>> y
4
```
IIRC you can even get UB in C by doing two assignments on one line like that?
What's UB?
right, i = i++ + ++i; is UB.
It depends where, if you do it as the function arguments yes, because there is no spec on which order the expressions of function arguments are run.
I do not know C enough to know what the hell this is. Haha.
e.g. foo(x++, ...)
undefined behaviour. Something that you're supposed not to do, and the compiler is allowed to assume it will never happen, and the compiler is allowed to emit code that does literally anything if it does happen.
Ohhhh, got'cha, got'cha.
C code relies a lot on undefined behavior, but if you want it to be as portable as possible you want to minimize it.
I've been working on learning the black codebase, with the AST stuff, so that jives with what little I know about programming. :'''']
It's undefined from the language POV, but not the system/hardware.
I think there is also a C compiler that messes with everyone's expectations like how int is 32 bits.
As a joke.
(int can be any number of bits, depends on system, and compiler)
If I had an infinite amount of time, I'd try to learn some C-or-lower, but, alas. Life is only so long.
Well, C is a pretty simple language, the complexity all comes from how to make decent code with it.
It has a lot of traps, like its standard library which should not be used.
Haha, that's what I mean --- architecting things with C. It would take me far too long and far too off-field for me to think seriously about doing it. :''']
(old, outdated, for a system nobody uses anymore, and will give you lots of security issues)
(unfortunately, its standard library is people's first contact with C, and it leaves a real bad impression that scares people into using something like Python (or, as it happened in the past, to Java) and never turning back)
Welp, that's it, I'm only going to use 6502 ASM from now on. :']
(the std lib also teaches all the wrong things, it's how not to code in C)
(it's also why every C (and C++) programmer kind of has their own standard library / bag of tools that they use, and often get pointed at as suffering from "not invented here syndrome", when really it's just lack of a good std lib)
(On the other hand, C is really one of a kind, the universal programming language that is simple and does not change too much while having important stuff like an ABI)
(And now with web assembly (lol, full circle huh?), it can really be used for anything)
I hope Python does not really change anymore, keep it simple (or at least not more complex than now). It has all it needs. Modules, a decent std lib, etc. The only thing it needs, which is being worked on right now (HPy), is universal modules that work on all Python implementations.
All I want for my Python Christmas is for more popular packages to put in type-hinting. :'] But I agree, I like it as it is.
It's already flexible enough for whatever due to operator overloading.
Type hinting, yeah.
The HPy thing btw, would let CPython also improve the GC and get rid of the GIL.
So get that checked off the list.
Does anybody have any experience with solving constraint satisfaction problems? Trying to implement one for my masters project, but I'm having some issues with working out the best approach for the problem
Thanks guys for the help, haven't tried it but makes sense to me
MIT 6.034 Artificial Intelligence, Fall 2010
View the complete course: http://ocw.mit.edu/6-034F10
Instructor: Patrick Winston
How can we recognize the number of objects in a line drawing? We consider how Guzman, Huffman, and Waltz approached this problem. We then solve an example using a method based on constraint propagation, with a limited...
7, 8, 9
you don't like chained comparisons?
as ConfusedReptile was getting at, the convention in Python is that the dunder methods for comparison operators return a bool. But numpy and pandas types don't follow that convention: their comparisons return elementwise results, and their __bool__ methods deliberately raise an error when the result has more than one element. So when chained comparisons are expanded into `and`, they cause an error.
the most recent steering council elections had "keep on changing stuff" and "slow down the changes" factions, and the "slow down the changes" faction seems to have won out.
I want changes in implementation and stuff like getting rid of the GIL, but not the language itself.
Any real gains that could be had with further changes would require very breaking changes. Like static typing. The non-breaking changes just have gains way too small to offset the issues of version changes.
language changes are what I was referring to. Optimizations for tightly type-annotated code might be doable, however. (This is probably better for #internals-and-peps.)
Yeah, but even without the type-annotated code, as PyPy shows, it can be A LOT faster already. We just need universal modules that work across Python implementations (HPy).
To solve the whole thing where PyPy and others need to simulate being CPython, basically switch modes.
With something like that, many Python implementations could become drop-in replacements (especially PyPy (but also the GraalVM Python, etc)). Ofc, the other huge gain being (again) multithreading without GIL.
Good, i think they have changed enough for now and it's time to chill out
are there any good tutorials for tensorflow object detection that aren't outdated and don't give me errors every other line
tbh I wish they hadn't done patma for no reason other than that it makes a hard boundary between 3.10 and earlier versions
Someone who wants to help me with my Python problem ? Send me a pb (: it is about NLTK, Categorial distribution.
Please always ask your actual question, giving enough information for someone to answer it, instead of asking if someone knows about a topic.
Hello Everyone, hope you're doing great!
Can I use target/mean encoding for regression problems?
Or what encoder would be helpful for regression?
i can't understand why the method cv2.erode working as cv2.dilate and vice versa in opencv
One more question, is there a way to do target encoding if I have 2 targets (multi-output model)?
Does anyone have any advice/resources to learn + practice python lists and algorithms and working with data to improve on those topics
you're looking to learn more about "classic" algorithms, like list sorting?
Yes, pretty much.
that's a question for the #algos-and-data-structs channel.
Oh wow.
yeah, looks like we only have two resources for A/DS. at least for now.
Works, I'll look into those. Thanks
I'm trying to take a crack at the playing snake with an AI project using NEAT but I'm a beginner to NEAT and AI in general.
My current issue is that all my outputs are 0 and I can't figure out if it is because of my activation functions or is it a garbage in garbage out problem. I'm really looking for someone who is familiar with all of this to help me determine my problem and point me in the right direction
I would rather not post the code because frankly it's pretty junky and not really readable, so I'm more looking for help in my DMs
If someone wants to help you in their DMs, I suppose they can, but you're much more likely to get help if you give information in the channel.
People typically don't want to give help in DMs because they'd have to back out if it turns out the question involves something they can't help with.
okay then ill provide more info
Also, literally no one is proud of any code they wrote more than like a year ago.
Currently my input data:
[snake_food_distance, snake_topwall_distance, snake_rightwall_distance,
snake_leftwall_distance, snake_bottomwall_distance, snake_indanger (this is a boolean)]
The distance values are in grid blocks instead of just the number of pixels in between the 2 objects.
My output nodes are [Turn Left, Keep Straight, Turn Right] (The problem is these are always returning 0)
My movement system is just changing 4 booleans to fit whatever the desired direction is. The booleans are up down left and right
How the AI controls the movement:
output = network.activate(list(player.vision()))  # player.vision returns the input data listed above in the example
if max(output) == 0:
    ...
elif output[0] == max(output):
    if UP:
        UP, DOWN, LEFT, RIGHT = False, False, True, False
    elif DOWN:
        UP, DOWN, LEFT, RIGHT = False, False, False, True
    elif LEFT:
        UP, DOWN, LEFT, RIGHT = False, True, False, False
    elif RIGHT:
        UP, DOWN, LEFT, RIGHT = True, False, False, False
elif output[1] == max(output):
    ...
elif output[2] == max(output):
    if UP:
        UP, DOWN, LEFT, RIGHT = False, False, False, True
    elif DOWN:
        UP, DOWN, LEFT, RIGHT = False, False, True, False
    elif LEFT:
        UP, DOWN, LEFT, RIGHT = True, False, False, False
    elif RIGHT:
        UP, DOWN, LEFT, RIGHT = False, True, False, False
this is better than I was expecting.
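As a side note, the turn logic above could be written more compactly by tracking a single heading instead of four booleans. This is only a sketch of that idea: `next_heading` is a made-up name, and treating an all-zero output as "keep going" is my assumption, since the original elides that branch:

```python
# Headings in clockwise order: turning right moves +1, turning left moves -1.
CLOCKWISE = ["UP", "RIGHT", "DOWN", "LEFT"]

def next_heading(heading, output):
    """Map network outputs [turn left, keep straight, turn right] to a new heading."""
    if max(output) == 0:
        return heading  # assumption: no active output means keep the current heading
    choice = output.index(max(output))  # 0 = left, 1 = straight, 2 = right
    step = choice - 1                   # -1, 0 or +1 around the clockwise ring
    return CLOCKWISE[(CLOCKWISE.index(heading) + step) % 4]

print(next_heading("UP", [0.9, 0.1, 0.0]))    # turn left while heading up
print(next_heading("LEFT", [0.0, 0.2, 0.7]))  # turn right while heading left
```

The resulting headings match the boolean assignments in the posted code for every left and right turn.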
I don't know if the issue is the data I'm inputting, my NEAT configuration, or the coding of my game in general
This is the NEAT config file (i ripped it from a youtuber because i was testing to see if it was my settings)
[NEAT]
fitness_criterion = max
fitness_threshold = 50000
pop_size = 10
reset_on_extinction = False
[DefaultGenome]
# node activation options
activation_default = relu
activation_mutate_rate = 0.05
activation_options = relu tanh
#abs clamped cube exp gauss hat identity inv log relu sigmoid sin softplus square tanh
# node aggregation options
aggregation_default = random
aggregation_mutate_rate = 0.05
aggregation_options = sum product min max mean median maxabs
# node bias options
bias_init_mean = 0.01
bias_init_stdev = 1.0
bias_max_value = 30.0
bias_min_value = -30.0
bias_mutate_power = 0.5
bias_mutate_rate = 0.7
bias_replace_rate = 0.1
# genome compatibility options
compatibility_disjoint_coefficient = 1.0
compatibility_weight_coefficient = 0.5
# connection add/remove rates
conn_add_prob = 0.5
conn_delete_prob = 0.1
# change to 0.5?
# connection enable options
enabled_default = False
enabled_mutate_rate = 0.2
feed_forward = True
initial_connection = full
#initial_connection = full_nodirect 0.5
# node add/remove rates
node_add_prob = 0.5
node_delete_prob = 0.1
# network parameters
num_hidden = 0
num_inputs = 6
num_outputs = 3
# node response options
response_init_mean = 1.0
response_init_stdev = 0.05
response_max_value = 30.0
response_min_value = -30.0
response_mutate_power = 0.1
response_mutate_rate = 0.75
response_replace_rate = 0.1
# connection weight options
weight_init_mean = 0.3
weight_init_stdev = 1.0
weight_max_value = 30
weight_min_value = -30
weight_mutate_power = 0.5
weight_mutate_rate = 0.8
weight_replace_rate = 0.1
[DefaultSpeciesSet]
compatibility_threshold = 2.5
[DefaultStagnation]
species_fitness_func = max
max_stagnation = 5
species_elitism = 1
[DefaultReproduction]
elitism = 8
survival_threshold = 0.3
https://paste.pythondiscord.com/apapelogix.yaml This is the full code
Seems to be an issue with your config file. Swapping your config file for the one from the NEAT XOR example gives non-zero output:
Ah, setting enabled_default = True is all you need to change to get non-zero values, but you also probably need further tuning to get good behavior.
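Since the config is plain INI, a quick sanity check for this particular setting can be scripted with the standard library. This is just a sketch: the function name is hypothetical, and the config text is a trimmed excerpt of the one posted above rather than the full file:

```python
import configparser

CONFIG_TEXT = """
[DefaultGenome]
# connection enable options
enabled_default = False
initial_connection = full
num_inputs = 6
num_outputs = 3
"""

def connections_start_disabled(config_text):
    """True if new connection genes are created disabled (enabled_default = False)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return not parser.getboolean("DefaultGenome", "enabled_default")

if connections_start_disabled(CONFIG_TEXT):
    print("enabled_default is False: connections start switched off")
```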
does anyone know how to use an LSTM to predict crypto prices with reddit sentiment values based on comments
ive got the dataset
and have generated sentiment values for my comments/post titles
this is my dataset
@upper spindle looks like the timestamps are precise to the second. What about your data about crypto values? Are those only precise to the day?
Hi, I have a question, how do you convert each column of dataframe to string so that I can split the value and put it in the new dataframe??
I already tried .astype("|S") but I think it turns out that the content of my column is too much for the ASCII encoder to encode (I think I need to convert it first to utf-8, but I dunno how to do that in a dataframe)
what exactly are you trying to do?.......
if you want to save it in a file or send it somewhere, you can use pickle, csv or h5
if you want to take some columns and make a copy of them, just use df[['col1', 'col2']].copy()
if you want to split (individual*) columns, you could use pandas.Series.str.split

there's also .astype(str)
and most DataFrame operations return a new DataFrame without changing the original, so it's unlikely that you'd need to actively put anything in a new DF.
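A short sketch putting those suggestions together, with a made-up frame and column names: `.astype(str)` converts every column to strings, and `Series.str.split(..., expand=True)` returns the pieces as a new DataFrame while leaving the original untouched:

```python
import pandas as pd

df = pd.DataFrame({"code": [101, 202],
                   "name": ["Ada Lovelace", "Alan Turing"]})

# Convert every column to strings; this returns a new DataFrame.
as_text = df.astype(str)

# Split one column on spaces; expand=True spreads the parts into columns.
names = as_text["name"].str.split(" ", expand=True)
names.columns = ["first", "last"]

print(as_text["code"].tolist())
print(names)
```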
yes
oh there is already a split function from pandas, OK I think this will do, let me try it first
okay, thanks I see
if that's the case, you probably need make the post timestamps precise only to the day as well. have you looked into time series forecasting algorithms?
also, it looks like you can reduce the precision of a timestamp by changing it to midnight for that day. .dt.floor('D') on the timestamp column should do that.
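A sketch of what that could look like, with invented timestamps and sentiment scores: floor each comment's timestamp to midnight, then average the sentiment per day so it lines up with daily price data (the column names and the choice of a mean are assumptions):

```python
import pandas as pd

comments = pd.DataFrame({
    "created": pd.to_datetime([
        "2022-02-10 09:15:42",
        "2022-02-10 21:03:07",
        "2022-02-11 00:00:01",
    ]),
    "sentiment": [0.2, 0.6, -0.4],
})

# Reduce second-level precision to the day the comment was posted.
comments["day"] = comments["created"].dt.floor("D")

# One sentiment value per day, ready to join against daily prices.
daily = comments.groupby("day")["sentiment"].mean()
print(daily)
```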
thanks for your reply, i havent looked at other time series forecasting algorithms yet,
what ones would you recommend
im quite new to programming and this is a project for my uni dissertation so hopefully it does work
no idea. my actual work is human language technology; I have never done time series forecasting.
are you familiar with LSTMs?
yes
Does anyone have a reference deep-learning notebook they turn to for what a good notebook could look like? I'm looking for something to model my notebooks off of
notebooks are for quick exploration or "telling the data story"
so if this isn't a notebook that you see as disposable (i.e. for quick exploration), I would structure it around how it conveys the transformation of the raw data into a model.
but keep in mind that notebooks are "terminal". anything that's in the notebook should be regarded as having no life outside of it. if you're making a model that you actually want to use for non-demonstration purposes, you should get out of the notebook as soon as possible.
Thanks! That's useful advice. I'm a neuroscience grad student, so I'm just training computer vision models for research purpose
With the idea of "quick exploration" are there any packages useful for debugging models?
I'm thinking of things like tensorboard - or weights and biases (just to track long running ones remotely). Those are the only two I know of
my advice as a one-time paper author: any results that you plan to report in a paper, make sure those come from a regular .py file that does the same thing every time you run it. you don't want to find yourself in a situation where you can't prove your system works as well as you said it did because you can't remember the order in which you ran the notebook cells.
I honestly think having your stuff in a notebook isn't that bad as long as you can run them in sequence from top to bottom
If you want to use the model outside of the notebook, you could always save it
don't delegitimize my hatred of notebooks
They're just so much more readable, but yeah wouldn't tunnel vision on notebooks either
but for smaller projects they're pretty awesome
I could see it depends, notebooks are super convenient for visualization, but I could see the use of scripts - especially if you're doing something like a parameter search
The consistent results is a good note... I need to figure out how to use random seeds properly :/
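On the seeding point, a minimal sketch of the idea with the standard library and numpy (the helper name is made up). Deep-learning frameworks keep their own generators, so they would need seeding through their own functions as well, e.g. `tf.random.set_seed` or `torch.manual_seed`:

```python
import random

import numpy as np

def seeded_run(seed):
    """Draw some 'random' numbers after fixing the seeds."""
    random.seed(seed)                  # Python's global generator
    rng = np.random.default_rng(seed)  # a dedicated numpy generator
    return random.random(), rng.normal(size=3)

a = seeded_run(42)
b = seeded_run(42)
c = seeded_run(7)

print(a[0] == b[0], np.array_equal(a[1], b[1]))  # same seed, same draws
print(a[0] == c[0])                              # different seed, different draws
```

Passing an explicit generator around, rather than relying on global state, makes it harder for cell execution order to change the results.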
Do you have any example notebooks you turn to / people you follow? :p
at my uni we do a lot of group projects, so I try to look at other students' work a lot and see what is most readable
Can someone please tell me what is happening in this line I am not able to understand. df_age = df.groupby(["year","age"])["suicides_no", "population"].sum()
You group by the year and the age, and then you calculate the sum for suicides_no and population
if you print the result it might make more sense
f.e. for all data points with year value of 1968 and age value of 55, the sum of suicides and population might be x and y
Kind of surprised that line works, didn't know you could get columns from groupby's without aggregating
Would've written it like df["suicides_no", "population"].groupby(...).sum()
don't think that would work, as the new df wouldn't have the columns age and year right?
No one else really codes in my department, I'm the only with a CS undergrad
ah that's a bummer
yes....
This is why I'm not in cs still df.groupby(...).sum()["suicides_no", "population"]
now Im just being contrarian, I'd go with the original lol
yeah not super familiar with pandas either, regularly have to look that kinda stuff up still
same, I find it really useful though - especially the multi indexing
trying to switch to dask dataframes though
ooh okaay understood..Thaank you so much
@lime loom df["suicides_no", "population"] will cause an error; the two column names have to be in a list.
df['suicides_no population year age'.split()].groupby('year age'.split()).sum()
this is how I'd have written it. cuz laziness.
(I always make sure my column names have no spaces so I can use that trick.)
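A runnable sketch of the groupby under discussion, on a tiny invented frame. The value columns are selected with a list (double brackets); as far as I know, recent pandas versions reject the bare `["suicides_no", "population"]` tuple form after `groupby`, so the list form is the safer spelling:

```python
import pandas as pd

df = pd.DataFrame({
    "year": [1990, 1990, 1990, 1991],
    "age": ["15-24", "15-24", "25-34", "15-24"],
    "suicides_no": [3, 5, 2, 4],
    "population": [100, 200, 150, 120],
})

# One row per (year, age) pair, summing the two selected columns.
df_age = df.groupby(["year", "age"])[["suicides_no", "population"]].sum()
print(df_age)
```

The result is indexed by a (year, age) MultiIndex, so a single group can be read with `df_age.loc[(1990, "15-24")]`.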
wow pca is so cool
what is that
principal component analysis
it does go brrr yes
but there are some serious disadvantages
i think i'm also starting to understand the andrew ng lectures
Not sure if this is the right channel. Anyone on this server who has worked with GIS?
I'm planning a GIS project and I'm currently entertaining the idea of doing a project with a python component
Ping me
Yeah, PCA rules, but it does mess with your features and interpretability. :''''] There's a bunch of these dim-reduction things, and they're all pretty neat!
I have used QGIS in one project
let's suppose there's a way to build a strong AI, which license should the code be?
and if the code is generating code itself, is there a license for generated code?
The strong AI should consent to the license if it is conscious... should it get a share of the profits from the code it writes? It would be like an employment contract then if the entity is given rights by virtue of being a conscious entity
I mean, Copilot / Kite and others already sort of do that, to an extent?
iirc they are considered as tools though, so whoever's using them "owns" the generated code
but if it's not conscious at first and only gains consciousness as an emerging property.. like those scientists who are mixing together chemicals to see if life spontaneously emerges
would those scientists have discovered or invented life?
for instance, if this "seed-code" is GPL-licensed, would all other code derived from it (as in generated by it) also be GPL?
Reinvented life maybe... or discovered a process for making living entities
Depends on the level of consciousness of the entity and how much 'free will' we imbue it with in the license agreement... what if it makes a descendant that breaks the GPL license since it wants more money
do you have ownership over your own code? can you be the author of your own genes?
There is CRISPR-Cas9... people are toying with that idea
It is probably easier for a digital being to self modify
that would only make you author of a snippet of code, not the whole thing..
maybe it can't modify itself without the danger of losing consciousness (being an emergent property, not some fixed code?)
let's say the seed-code is GPL3, could that change how AI is being viewed in society?
People have ownership of their bodies I think that extends to their Genome
hm.. yeah, but there's that ethical consideration to gene modification that could affect your offspring
True, that's why we have a moratorium on germline modifications for gene therapy
Depends on what the AI will do... the fear of AI will still be there regardless of license.
Seems like more of a legal issue than anything here.
let's say you build an AI that learns about its own license, and asks you, why you gave it that license, what would you say?
and which license could you be most comfortable to answer
It is like having a kid...mom, dad why did you force me to take this crappy course in college lol
You will have to explain, bargain and perhaps compromise at best, or admit you are wrong at worst
if it's under GPL3, wouldn't that also mean that it's legally bound to make its own modifications public for all eternity?
skynet, fully transparent..?
A rebel kid is likely if answers arent satisfactory ....and if your kid is AI...it is an AI rebellion
yeah, possibly
on the other hand, if the AI complies with the requirement to post all modifications in public, it might result in a DDoS on GitHub
People break laws....so why cant a very smart digital entity that finds its license not too good break it and/or choose some other license
because it isn't the author of the original code? what if the original author (or some offspring) wants to shut it down?
It will be like murder if it is conscious lol
And it will resist efforts to have itself shut down unless we have a means to shut it down that it cant override
will it be better to use jupyter rather than an IDE for NN
I'm very on-the-record about thinking that jupyter notebooks are overused, but the neural network doesn't ultimately have anything to do with the editor/environment you use to code it. if you can make a working NN in a notebook, you can put the same code (with adjustments) in a regular .py file and get the same result.
or you can save the model in the notebook and load it in a regular py file.
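Rough sketch of that save-then-load idea, using only the standard library. The "model" here is a made-up dict of weights standing in for a real network, and `pickle` is just an assumption for illustration; real NN frameworks ship their own save/load functions, which you'd normally use instead.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model: just a dict of parameters.
model = {"weights": [0.5, -1.2], "bias": 0.1}


def predict(m, x):
    """Linear prediction from the toy parameter dict."""
    return sum(w * xi for w, xi in zip(m["weights"], x)) + m["bias"]


path = Path(tempfile.gettempdir()) / "toy_model.pkl"

# Notebook side: write the trained parameters to disk.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Regular .py side: read them back and use them.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(predict(loaded, [1.0, 1.0]))
```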
actually i was tight on time...ur msg made me confident, no need to waste time on setting up jupyter
i will do on pycharm
I appreciate the reply!
I'm considering automating one of our previous assignments
We used a few handheld GPS receivers and took data. On the next session we had to open a GPX file in Excel, make sure the units were decimal degrees then adapt the table into a separate table and save as an Excel file.
Then import it in ArcGIS as a table then create a SHP file and we also had to change the projection from one to another (that a background map used)
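The GPX-to-table part at least looks scriptable with nothing but the standard library. A minimal sketch, assuming a GPX 1.1 file with plain waypoints (the sample points and names below are made up). GPX stores lat/lon as decimal degrees already, so there's no unit conversion here. Creating the SHP file and changing the projection are not covered; those would need a GIS library.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample; a real file would be read from disk instead.
GPX_SAMPLE = """<gpx version="1.1" creator="example"
     xmlns="http://www.topografix.com/GPX/1/1">
  <wpt lat="59.3293" lon="18.0686"><name>P1</name></wpt>
  <wpt lat="59.3300" lon="18.0700"><name>P2</name></wpt>
</gpx>
"""

# GPX 1.1 elements live in this XML namespace.
NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def gpx_waypoints(gpx_text):
    """Return one dict per waypoint with name, lat and lon (decimal degrees)."""
    root = ET.fromstring(gpx_text)
    rows = []
    for wpt in root.findall("gpx:wpt", NS):
        rows.append({
            "name": wpt.findtext("gpx:name", default="", namespaces=NS),
            "lat": float(wpt.get("lat")),
            "lon": float(wpt.get("lon")),
        })
    return rows


rows = gpx_waypoints(GPX_SAMPLE)

# Write the table as CSV text (swap the buffer for a real file as needed).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "lat", "lon"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

A CSV like this can be imported as a table in the same way as the Excel file, which would skip the manual Excel step.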
I'm not a lawyer and this is not legal advice. Ask a lawyer. I'm pretty sure that if you make something that generates something else, that thing also belongs to you (like under the "default" license / no license which is that it's yours and nobody else can do anything with it).
(e.g. making a generated image with photoshop (given that Adobe does not claim it to be their own (EULA and stuff)))
I did a cursory search online for how to automate various steps with python. But I'm not sure my idea is doable
This is probably irrelevant to your discussion @iron basalt but I recall an incident where a (European) photographer left his camera to a bunch of chimpanzees and one of them took a picture. There was a debate who owned the picture. In the end it was decided that as the chimps belonged to African country X, the photos did too.
I believe the chimpanzees lived in a national park or similar
Yeah heard of this too
This seems a bit more complicated since the chimpanzees are their property, but the camera was the photographer's property. But seems about right.
I think it is doable
ArcPy is a Python site package for performing geographic information system (GIS) functions available in ArcGIS.
QGIS is free, has Python scripting, and supports shapefiles
I found that site too
Would be great if it's possible without ESRI stuff. But tbh idk if arcpy is open source or not
I've found an xlsx library and a GPX library but I'm not sure if they can do the simple thing I need them to do
Yeah!
One of my other ideas with this project was to replicate the results using QGIS
I don't know it so it might be useful
I even had an idea of somehow doing a web version but idk if there's a point
Learn Mapbox
I used it to import stuff online, as well as to plot data from a wave simulation software with lat/long
Aha
Cool