#data-science-and-ml


cedar sun
#

and what will happen with the connections?

#

i need to retrain it,right?

novel elbow
#

yes

#

you can freeze all previous layers and just train the last one

cedar sun
#

uuuuuh

#

no but

#

i get my model with load_model from keras

#

i need to load the model, remove the last layer, add my own, freeze, and train?

#

it shouldnt take too long, right?

novel elbow
#

yes, should be faster than training all the model and you don't need many epochs as you are only optimizing one layer
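
A minimal sketch of the load, swap the last layer, freeze, train sequence discussed above, assuming `tensorflow.keras`. The tiny `base` model stands in for whatever `load_model` returns, and the layer sizes, the 3-class output, and the name `new_head` are made up for illustration.

```python
from tensorflow import keras

# Stand-in for keras.models.load_model(...): a small classifier
# with 10 output classes (sizes are hypothetical).
base = keras.Sequential([
    keras.Input(shape=(32,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# Freeze every existing layer so training does not change its weights.
for layer in base.layers:
    layer.trainable = False

# Reuse everything up to the second-to-last layer, then attach a new head.
features = base.layers[-2].output
new_head = keras.layers.Dense(3, activation="softmax", name="new_head")(features)
model = keras.Model(inputs=base.inputs, outputs=new_head)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(x_train, y_train, epochs=5)  # x_train / y_train are your own data
```

Only the new head's kernel and bias are trainable here, which is why a few epochs are usually enough.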

cedar sun
#

thanks thanks

upper spade
#

guys where to learn pandas

pallid cliff
#

hi here,
I'm trying to do some stats on a pd DataFrame on some products,
I have a column store which is a list of string : ['carrefour', 'auchant', 'bi1', 'wallmart', ...] stores where that product is sold
and a column calories : float number of calories in that product
I want to rank stores based on the average calories of the products they sell
can someone help me ?

near cosmos
pallid cliff
# near cosmos It'll be something like (from memory so something might be a little off) ```py ...

it's not really working,

df.groupby('store')['kcal'].mean().nlargest(10)
[['Wholefood']]                                                                      3830.0
[['Costco']]                                                                         3779.0
[['Super U', 'Magasins U', 'Woolworths', 'Coles']]                                   2384.0
[['carrefour market plouagat']]                                                      2000.0
[['Biocoop eau vive']]                                                                900.0
[['Bo nature et santé']]                                                              900.0
[['Carrefour Market', 'Leclerc', 'Systeme U', 'Auchan', 'Casino', 'Intermarché']]     900.0
[['Carrefour', 'houra.fr', 'Magasins U']]                                             900.0
[['Carrefour', 'intermarché']]                                                        900.0
[['Carrfeour', 'Auchan', 'Leclerc', 'Systeme U', 'Casino', 'Monoprix']]               900.0
Name: kcal, dtype: float64

it's not grouping the way I want. See how "Carrefour" appears in multiple rows

woven surge
#

Hi, so I'm looking into config files and I have one, but it generates based off of my main.py file which explicitly defines the data structures. I want to be able to modify the config and have that reflect in my main.py file. How would I do this?

What's happening now:
main.py creates config.ini w/ pre-defined data-structures (config.ini = hardcoded)

What I want:
modifiable config file which main.py retrieves information from and uses to perform operations
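
One way to get that behaviour is to let the script write defaults only when the file is missing, and read from the file otherwise. This is a sketch with the standard-library `configparser`; the file name, the `training` section, and its keys are hypothetical.

```python
import configparser
from pathlib import Path

CONFIG_PATH = Path("config.ini")  # hypothetical file name

# Defaults are only used to create the file the first time.
DEFAULTS = {"training": {"epochs": "10", "learning_rate": "0.01"}}


def load_config(path=CONFIG_PATH):
    """Read the config file, creating it from DEFAULTS if it is missing."""
    parser = configparser.ConfigParser()
    if not path.exists():
        parser.read_dict(DEFAULTS)
        with path.open("w") as fh:
            parser.write(fh)
    else:
        parser.read(path)
    return parser


if __name__ == "__main__":
    config = load_config()
    epochs = config.getint("training", "epochs")
    learning_rate = config.getfloat("training", "learning_rate")
    print(epochs, learning_rate)
```

After the first run, editing `config.ini` by hand changes what the script sees on the next run, because the defaults are never written again.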

near cosmos
serene scaffold
#
# what you have
[['Wholefood']]                                           A
[['Costco']]                                              B
[['Super U', 'Magasins U', 'Woolworths', 'Coles']]        C

# what you want
'Wholefood'  A
'Costco'     B
'Super U'    C
'Magasins U' C
'Woolworths' C
'Coles'      C
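
The reshaping sketched above is what `DataFrame.explode` does. A small example on toy data, assuming the `store` column holds real Python lists (if it holds strings that merely look like lists, they would need parsing first):

```python
import pandas as pd

# Toy data: each product lists the stores that sell it.
df = pd.DataFrame({
    "store": [["Carrefour", "Auchan"], ["Carrefour"], ["Costco", "Auchan"]],
    "kcal": [900.0, 300.0, 500.0],
})

# explode() gives each (product, store) pair its own row,
# so groupby sees individual store names instead of whole lists.
ranking = (
    df.explode("store")
      .groupby("store")["kcal"]
      .mean()
      .sort_values(ascending=False)
)
print(ranking)
```

Spelling variants such as "Carrefour" and "carrefour" would still count as different stores unless the names are normalised first.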
oblique raft
#

Can someone recommend a good tutorial for generating text with an RNN in Keras? All the tutorials I've tried just don't work for me... Thanks in advance

oblique raft
#

Interesting...

#

Python 3?

grave frost
woven surge
oblique raft
#

Spam

sly salmon
#

Hey guys, Q about gradient descent.

When utilizing gradient descent, what is the function we are "descending"?

Is it the Loss (y axis) vs Feature weight (x axis)? If so how do we find this function?

Also for a neural network, is gradient descent done for each individual weight? E.g. for 20 lines connecting nodes, 20 gradient descents are performed on each weight to find the weights resulting in the least error?

desert oar
# sly salmon Hey guys, Q about gradient descent. When utilizing gradient descent, what is th...

Is it the Loss (y axis) vs Feature weight (x axis)?
Yes, but keep in mind that there can be many feature weights in a big neural network, potentially thousands or millions. So x can be a very high-dimensional vector.

If so how do we find this function?
Either derive it by hand, or use an "automatic differentiation" software package to compute it for you.

Also for a neural network, is gradient descent done for each individual weight? E.g. for 20 lines connecting nodes, 20 gradient descents are performed on each weight to find the weights resulting in the least error?
No. The "gradient" is kind of like a vector-valued derivative. You update the entire weight vector in one step. Updating each weight individually is called "coordinate descent", which is used in some models but usually it's not important from the user perspective.

Gradient descent is effectively an implementation detail. It happens that neural networks are so difficult to optimize that tuning the optimizer is a necessary part of training them. With most other models and optimizers, you don't have to tune the optimizer in day-to-day usage (e.g. logistic regression with L-BFGS).
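
To make "update the entire weight vector in one step" concrete, here is a small sketch that fits y = a*x + b by gradient descent on the mean squared error. The data, the learning rate, and the step count are arbitrary choices for illustration.

```python
# Fit y = a*x + b by gradient descent on the mean squared error.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1


def gradient(a, b):
    """Partial derivatives of the MSE with respect to a and b."""
    n = len(xs)
    errors = [(a * x + b) - y for x, y in zip(xs, ys)]
    grad_a = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)
    return grad_a, grad_b


a, b = 0.0, 0.0
learning_rate = 0.05
for _ in range(2000):
    grad_a, grad_b = gradient(a, b)
    # Both weights move together in a single step.
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b

print(round(a, 3), round(b, 3))
```

The gradient is computed once per step from the current (a, b), and both parameters are updated from that same gradient, rather than running a separate descent per weight.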

simple shadow
#

hi! i was wondering how do i iterate over rows to find a specific value that meets a certain condition

grave breach
#

Just do:
for row in rows:
    if <whatever>:
        do_something()

#

Or maybe you meant something like anomaly detection?

#

(for that you need ai)

#

@simple shadow

grave frost
#

technically tho, even using if-else is A.I

desert oar
#

@grave breach this channel is kind of the catch-all channel for pandas, numpy, matplotlib, and scipy

grave breach
#

Sorry, I didn't get that he meant pandas rows, I thought he just had a list of lists

desert oar
#

ah yeah. you'll start to see common patterns in people's XY questions so you can skip some of the back and forth "what do you mean" stuff.

near cosmos
#

I didn't see that in there either

simple shadow
#

@desert oar i want to change specific values in one column
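
For changing specific values in one pandas column, a boolean mask with `.loc` usually replaces the row loop. A sketch on made-up data (the column names and the "negative price" condition are only examples):

```python
import pandas as pd

df = pd.DataFrame({"product": ["a", "b", "c"], "price": [5, -1, 12]})

# The boolean mask selects the rows; .loc assigns only in the chosen column.
df.loc[df["price"] < 0, "price"] = 0
print(df)
```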

near cosmos
simple shadow
#

yes @near cosmos

near cosmos
cedar sun
#

@serene scaffold hello. He didnt reply u yet, right?

serene scaffold
cedar sun
#

huh?

#

what do u think i want? i mean, i was looking for like 50+ images of each pokemon

uncut barn
#

For a dissimilarity measure to compare ratings such as very dissatisfied, dissatisfied,
neutral, satisfied, very satisfied should it be 1 hot encoded and then use the hamming distance?

#

or can I use euclid's distance
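
A quick way to see the difference between the two options: with one-hot encoding, the Hamming distance is the same for every pair of different ratings, while a distance on the rank positions keeps the ordering. This sketch uses plain Python; the scaling to [0, 1] is one common convention, not the only one.

```python
LEVELS = ["very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied"]
RANK = {level: i for i, level in enumerate(LEVELS)}


def one_hot(level):
    return [1 if i == RANK[level] else 0 for i in range(len(LEVELS))]


def hamming(a, b):
    """Number of positions where the one-hot vectors differ."""
    return sum(x != y for x, y in zip(one_hot(a), one_hot(b)))


def ordinal_distance(a, b):
    """Absolute rank difference, scaled to the range [0, 1]."""
    return abs(RANK[a] - RANK[b]) / (len(LEVELS) - 1)


print(hamming("very dissatisfied", "dissatisfied"))            # adjacent levels
print(hamming("very dissatisfied", "very satisfied"))          # opposite ends
print(ordinal_distance("very dissatisfied", "dissatisfied"))
print(ordinal_distance("very dissatisfied", "very satisfied"))
```

Both Hamming results are equal, so the one-hot route treats the ratings as unordered categories; the rank-based distance (which is Euclidean distance on a single ordinal column, up to scaling) does not.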

gentle lion
#

hey i'm using tensorflow.keras to train a CNN, but for some reason it doesn't show anything or do anything after model.fit

#

i basically removed all my layers but still nothing

cedar sun
#

I need to gather a big image data set of pokemons to make a cnn classifier. If any of u wanna help me, pls check google image search python api, and follow the steps. Ping me too to share a script

gentle lion
#

it just prints 1 and keeps running forever

#

any idea why it might be?

#

i even have verbose to 1 so it should print some epoch info

sly salmon
# desert oar > Is it the Loss (y axis) vs Feature weight (x axis)? Yes, but keep in mind that...

Thank you for the awesome explanation. I am getting a grasp of how the cost function updates the weights of all the weights instead of doing them individually.

I have a few more qualms, if you don't mind.

  • How do we actually know the formula of our cost function (e.g. in the form y = mx + c)?
    My current intuition is that we set our cost function from the get-go, e.g. least squares. So that's how we know. Is that correct?

Also, here is how I think gradient descent works: Please let me know if this is wrong

  • We have a point on our cost-function, (x, y, z) which is the weights of our x, y, z features

  • We partially differentiate our cost-function, and sub in values for x, y, z to find the gradient at the point specified in the previous step (the rate of change of the loss function in respect to all features)
    so, something like [1, 2, 2]

  • We then go down this gradient, by updating the weights for our x, y and z features at once:
    weights = [5 (weight of x), 4 (weight of y), 3 (weight of z)]

# multiplying our weights with the gradient of the weight
[ 5
  4    x  -[1, 2, 2] = multiplied_weights
  3 ]

new_weights ([new_x, new_y, new_z]) = old_weights - training_steps * multiplied_weights

We recalculate our cost, and carry on, until it's a minimum.

The hardest part was thinking of each feature as a vector. I really appreciate your help. If you have a bitcoin address let me know.

sly salmon
#

also if I'm finding a cost function for a neural network, how many iterations should I go through (adding the cost to a cost_sum variable before I divide by the number of iterations to get the avg. cost)?

fiery cipher
#

hi, I have a data set that is a mix of int and string; the type is numpy.ndarray. I'm trying to detect the string attributes with a condition. I used if isinstance(point, str) == True but it doesn't seem to work
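
One possible cause, offered as a guess since the actual array was not shown: when NumPy builds an array from mixed ints and strings without an explicit dtype, it converts everything to a string dtype, so every element looks like a string. Keeping the original Python objects requires `dtype=object`.

```python
import numpy as np

mixed = np.array([1, "red", 3, "blue"])
print(mixed.dtype)  # a string dtype: the ints were converted to text

# dtype=object keeps each element as the Python object it was.
as_objects = np.array([1, "red", 3, "blue"], dtype=object)
strings = [item for item in as_objects if isinstance(item, str)]
print(strings)
```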

near cosmos
fiery cipher
soft silo
#

Hi guys, is someone experienced with scipy here? I'm trying to solve a set of two differential equations and i need some help to verify if it's correct

fiery cipher
near cosmos
fiery cipher
#

the centroids are random points from the data this is how they look

cedar sun
#

guys, using seaborn, how can i make a barplot without labels on one axis?

near cosmos
cedar sun
#

it doesnt allow me

#

x = sns.barplot(list(dct.keys()), list(dct.values()), x=None)

#

so

#

dct.keys() is what i dont wanna display xd

#

so basically what i wanna display is

#

string1 = 80, string2 = 120

#

etc

#

the values of each one

#

but i only want a vertical axis having numbers like 25-50-75-100

#

do i explain?

leaden meteor
#

good evening

near cosmos
cedar sun
#

i dont want dct.keys to appear on the legend

arctic crown
#

please help

warped cave
#

what do neat-python outputs correspond to and how do I know what to do with them exactly?

desert oar
# sly salmon Thank you for the awesome explanation. I am getting a grasp of how the cost func...

My current intuition is that we set our cost function from the get-go, e.g. least squares. So that's how we know. Is that correct?
Yes. You choose your loss function f(y_true, y_predicted) and plug in your model for y_predicted. So if your model is linear regression, the loss function is f(y_true, ax + b).

I'm not sure I fully understand your example of gradient descent. Yes, the gradient at a specific point is the vector of partial derivatives evaluated at that specific point. You don't perform weight updates multiplicatively, you do them additively. I recommend looking at the equations, you might be surprised at how simple it is.
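
As a sketch of the additive update, using the weights and gradient from the question and an assumed learning rate of 0.1:

```python
weights = [5.0, 4.0, 3.0]      # current weights for x, y, z
gradient = [1.0, 2.0, 2.0]     # partial derivatives of the loss at that point
learning_rate = 0.1            # assumed step size

# Subtract a scaled copy of the gradient; nothing is multiplied into the weights.
new_weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
print(new_weights)
```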

Our server rules disallow offering or requesting payment for help, but I wouldn't accept payment anyway.

desert oar
severe valve
#

Does anyone know of any complete beginner tutorials that introduce Keras? I really want to get into ML a bit more but every tutorial I find is absolute trash when it comes to explaining the code. For the most part, other than some deeper level ideas, I understand most ML concepts. That isn't the issue. However, understanding what I'm writing down and not just mindlessly copying it is my problem. I have no idea what the code does or how to use it without almost completely rewatching an entire tutorial. And a lot of the tutorials that I do end up finding are very vague and tend to just read commonly available materials which I have already gone through. I'd really appreciate any and all help I could receive, thank you!

autumn basin
#

@severe valve the keras documentation is honestly the best for this. Tutorials tend to abstract away from what is going on under the hood.. and over abstraction leads to the confusion you are talking about. The documentation is dry, but it will leave you with an accurate understanding of how to use the API.

near cosmos
#

You are passing a file object, instead of a string, to tokenize_words. You need to .read or similar from the file to get the contents

severe valve
gentle lion
#

can anyone explain this? im using CNN with keras and it prints epoch 1/10 and then lags for a bit and quits

gentle lion
#

basically when i add a conv layer to my network it stops working

#

any help is appreciated

subtle panther
#

Can anyone tell me please do i need to have a good understanding of mathematics in order to learn ML??

desert oar
#

@subtle panther you should intend to expand your math knowledge and understanding in parallel with your hands-on experience and your programming knowledge

#

So yes, eventually. But you can get started without knowing lots of math up front

#

If you already know calculus and the basics of linear algebra (vector/matrix math and the interpretation thereof as systems of linear equations) it will help

distant phoenix
#

Hello guys!
Could you help me to create a script?

Note that I'm using 'Selenium Webdriver' and now I need to get all these values (red square) and put them in a list, for example:

[5-2021, 4-2021, 3-2021, etc...]

If I could get these values in a list, I could build a big web scraper.

#

I can already click on each field, but I want to put it in a loop, where the RPA selects the first value, clicks, and gets all the table values, then clicks on the second and gets all the table information

lapis sequoia
distant phoenix
#

I'll search for it and learn how to use. Thank you a lot!!

lapis sequoia
#

You’re welcome 🙂

red hound
#

Is there a list somewhere that shows all the deprecated functions from tensorflow v1 and their equivalents?

subtle panther
#

@desert oar thanks I understood properly what I need to do

distant phoenix
# lapis sequoia You’re welcome 🙂

I've imported this lib as you suggested, and I've tried many ways to extract the information.

'div', class_="form-control custom-select"'
'div', class_="drop-meslme"'
'div', class_="col-12 col-md-5 col-lg-4"'
'select', name_="meslme"
etc...

I can't find any field in the red square either.

#
#

and the values are in the "combobox" as in following picture

woven kayak
#

Hi, does anyone know how to use an audio output as a signal generator in google colab?
I try to generate a signal v = E.sin(wt+phi) + Vbias but the function IPython.display.Audio kills my bias.
Thanks in advance!

torpid ember
#

hey when building analytical models, what is being referred to when someone says "have you done the business rules for this model"?

#

someone at work is asking me and im scared to answer because idk

#

does he mean documentation

#

TELL ME YOU HAVE IMPOSTER SYNDROME WITHOUT TELLING ME YOU HAVE IMPOSTER SYNDROME

grave frost
lapis sequoia
#

I want to make a project analyzing programming language popularity by developer type based on the data contained in the Stack Overflow 2020 Developer survey.

I thought about creating a separate DataFrame for each dev type, then calculating a percentage for each language each dev type said they worked with, but it sounds like too much work for something that surely has a simpler solution.

#

any ideas?

arctic ice
#

how can I use opencv to scan what the camera sees and then give a 3d diagram of that to the computer

agile sinew
#

I'm streaming some time series data from Kafka using Spark structured streaming of 10 seconds but sometimes streamed data not contain 10sec .. any solution?

#

thanks in advance

desert oar
lapis sequoia
#

Hello guys, I have a question: how do I calculate the effect size in a chi-square test? (Outputs I should get are: Phi & Cramér's V)
Thank you for help
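
A sketch of the usual textbook formulas, phi = sqrt(chi2 / n) and V = sqrt(chi2 / (n * (min(rows, cols) - 1))), computed by hand from a contingency table. The table values are invented, and no continuity correction is applied.

```python
import math


def chi_square_effect_sizes(table):
    """Return (chi2, phi, cramers_v) for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi = math.sqrt(chi2 / n)
    k = min(len(table), len(table[0])) - 1
    cramers_v = math.sqrt(chi2 / (n * k))
    return chi2, phi, cramers_v


chi2, phi, v = chi_square_effect_sizes([[30, 10], [10, 30]])
print(round(chi2, 2), round(phi, 2), round(v, 2))
```

For a 2x2 table the two measures coincide; they differ only when the table has more rows and columns.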

cedar sun
#

what is the nn that codes for u?

snow cliff
#

hi guys can anyone help me out please im sorta really stuck. im trying to draw a sphere onto processing but anytime i use P3D, the display image is just grey and blank. does anyone know why?

grave frost
covert adder
#

Anyone available to help me with Intro to Data Analysis for Python?

median ember
#

thanks, but doesn´t work on multiple items on B, I solved it, it was quite tricky

velvet thorn
#

you would need a different approach for that

#

but your original example only had one row

median ember
velvet thorn
#

but if you’ve solved it then gratz 🏆

desert oar
#
import numpy as np
import pandas as pd

data_in = pd.DataFrame([
    {'x': 3, 'y': 2, 'z': 1, 'data': [0.1, 0.2, 0.3]},
    {'x': 2, 'y': 0, 'z': 4, 'data': [0.7, 0.8, 0.9]},
])

shape_out = (5, 5, 5, 3)
data_out = np.zeros(shape_out)
for row in data_in.itertuples():
    data_out[row.x, row.y, row.z, :] = row.data

Is there a way to do this using Numpy fancy indexing, or something otherwise vectorized, without looping + itertuples?

#

Naively I had tried

shape_out = (5, 5, 5, 3)
data_out = np.zeros(shape_out)
data_out[data_in['x'], data_in['y'], data_in['z'], :] = data_in['data']

But I got the ValueError: setting an array element with a sequence. error

velvet thorn
desert oar
#

oof, really?

velvet thorn
#

you get that problem because you have an array of lists

#

in data

#

data_in['data'].tolist() this converts it into a list of lists

#

which numpy will treat as an array

#

you need either an array or list of lists, not a mixture

desert oar
#

huh, i figured it wouldn't matter if it was a list or an array-like thing

#

yep that worked perfectly

data_out2[data_in['x'], data_in['y'], data_in['z'], :] = data_in['data'].tolist()
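
Putting the pieces together, a self-contained version of the vectorized assignment with a check against the original loop. The explicit `.to_numpy()` calls are an extra precaution added here, not something the thread required.

```python
import numpy as np
import pandas as pd

data_in = pd.DataFrame([
    {"x": 3, "y": 2, "z": 1, "data": [0.1, 0.2, 0.3]},
    {"x": 2, "y": 0, "z": 4, "data": [0.7, 0.8, 0.9]},
])
shape_out = (5, 5, 5, 3)

# Loop version, kept as the reference result.
looped = np.zeros(shape_out)
for row in data_in.itertuples():
    looped[row.x, row.y, row.z, :] = row.data

# Vectorized version: tolist() turns the object column into a list of lists,
# which NumPy reads as a regular (2, 3) array matching the two index rows.
x = data_in["x"].to_numpy()
y = data_in["y"].to_numpy()
z = data_in["z"].to_numpy()
vectorized = np.zeros(shape_out)
vectorized[x, y, z, :] = data_in["data"].tolist()

print(np.array_equal(looped, vectorized))
```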
velvet thorn
#

it doesn't as long as it's homogenous

desert oar
#

@fallen trellis see above

velvet thorn
#

also you can leave off the final :

#

but that's not very important

desert oar
#

yeah i like it for visual clarity

desert oar
#

i.e. not dtype='O'?

velvet thorn
#

just not an array of lists

#

because then numpy treats it as having length 1 across the relevant axis and tries to broadcast across it

#

which leads to trying to put the list in individual slots in data_out

#

hence setting an array element with a sequence (said list)

desert oar
#

yeah that makes sense, it doesn't know what the objects are in the array so it treats them all as scalars

#

good to know

median ember
cedar sun
#

so

#

how bad this is

#

this folder is supposed to have only bulbasaurs

#

but api randomly grabbed an ivysaur, the evolution

#

how many fails like this are a problem for a neural network?

#

Like, if i have 100 images, how many failures can i afford?

#

trying to avoid data clean :)

covert adder
#

Please help!

  • Generate a vector of 1000 random numbers between 0 to 100.
  • Plot a histogram of these numbers with number of bins equal to 10.
  • Calculate the average of these numbers by using numpy method mean().
  • Plot a red line (red color) from the mean point on the histogram plot in y direction to show the mean location in the plot.

import numpy as np
import matplotlib.pyplot as plt
data = np.random.randint(100,size(1,1000))
print(data)
matplotlib.pyplot
plt.hist(list(data)),range=(0,100),bins=10
mean = data.mean()
print(mean)

exotic maple
#

what is your issue?

covert adder
#

I did the homework. I just keep getting an error.

exotic maple
#

please share that error

covert adder
#

NameError: name 'matplotlib' is not defined

exotic maple
#

uh, have you installed matplotlib?

#

you need to install the library first before using it

#

pip install matplotlib, or if using a conda distribution, conda install matplotlib

covert adder
#

it is installed

exotic maple
#

then at that point you need to make sure you're using the right PATH

#

that is, that your Python install is pointing to the right direction

#

unfortunately that's something you need to do yourself

#

sometimes rebooting works, but first you need to make sure Path is ok

covert adder
#

thank you
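
For the record, the NameError in the homework script is most likely not an installation or PATH problem. `import matplotlib.pyplot as plt` only binds the name `plt`, so the bare `matplotlib.pyplot` line refers to a name that was never defined. The script also has `size(1,1000)` where a keyword argument is needed and a misplaced parenthesis in the `plt.hist` call. A corrected sketch, where the choice of `axvline` for the red line is one reasonable reading of the assignment:

```python
import matplotlib.pyplot as plt
import numpy as np

# 1000 random integers between 0 and 100 inclusive.
data = np.random.randint(0, 101, size=1000)
mean = data.mean()

# Histogram with 10 bins; hist() returns the bin counts and the bin edges.
counts, edges, _ = plt.hist(data, range=(0, 100), bins=10)

# Vertical red line at the mean.
plt.axvline(mean, color="red")
print(mean)
```

Call `plt.show()` or `plt.savefig(...)` afterwards to see the figure.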

desert oar
#

there are some techniques that are specifically designed to adjust for mislabeled data, e.g. Gold Loss Correction which i've used with some success in the past

#

this also isn't specific to neural networks. the same reasoning applies for pretty much all statistics and machine learning

cedar sun
#

mmmm

#

just for in case u know

#

if instead of searching "bulbasaur"

#

i search "*bulbasaur*"

#

with *

#

will i increase my occurrences of bulbasaur?

#

like in a normal user search. * mean only that

#

or thats what i think

desert oar
#

i have no idea, it depends on what exactly you're searching

cedar sun
#

nah, nvm, dont worry about it

#

i will search for a few images, and take a fast look, and if i see many misplaced imgs

#

i will look for that loss u mentioned above

tidal bough
#

just at the screenshot you provided, there's also a picture with a faraway view of a street, and one with 3 pokemon

#

so a different evolution that looks quite similar isn't a big problem comparatively

cedar sun
#

so

#

the street is a bigger problem?

#

lmao

#

This is the street image

#

there is a bulbasaur technically

velvet thorn
#

hm I think that can be improved

#

but I'd need to have a more complete example

#

and if what you have works for your purposes then might as well go with it

median ember
median ember
#

jesus, I sent you the wrong code

#

oh no, it was the right one

#

sorry

severe valve
#

anyone ever feel like they've hit a wall when it comes to learning ML/NN? I really want to learn a lot about these fields so I can apply them to a future job in medical research but I just can't sometimes. It gets so boring and blunt. Everything feels too complex and involves so much math but I feel like if I don't learn ML/NN, then I won't be sought after job-wise. Data analytics, visualization, etc only takes you so far. Even if I try and push through this, all I get is a bunch of information that I can't apply leading me to rewatch the videos and get stuck in an endless cycle.

desert oar
#

My advice: ditch the videos, use textbooks, spend some time learning the math.

#

With a good textbook, doing some exercises at the end of each chapter can be very important for learning.

#

At the same time, just start messing with data.

#

Do a bit of math, then forget all about math and just make some pretty plots, or fit some models.

crisp ruin
#

sick name bro

severe valve
#

so i've basically just been doing that last part, i've just messed around with a lot of models. but I've had absolutely zero idea what the model does other than on a high level. ( E.g CNN ~ image classifier. Linear regression ~ linear problem. etc ) But so far that hasn't gotten me very far and when I get errors or don't understand why my model is performing so bad I just have to stop because I have zero understanding of the subject.

#

and then when I go to actually learn it just becomes more and more difficult.

#

But I'll try and find some textbooks if I can and try doing the math

merry ridge
#

I have a friend that easily started a summer job that eventually turned into a part time, then full time position just blind applying to the clinical research unit at my university with just a bachelors and no experience in medicine. It really helped her learn machine learning in a more meaningful way, but the Math was unavoidable. Her first month there was just reading a textbook on Markov chains.

severe valve
#

exactly my point. I'd really love to apply ML ( as I learn best through application. I initially struggled with this in other programming languages before I found python ) but even when I try to apply it, it all just breaks down in front of me. But I guess the concepts behind ML are the most important for now, I'll definitely go look into textbooks for ML. Thank you everyone for your advice and time. :)

merry ridge
#

I think this is a commonly mentioned book, I read Mathematics for Machine Learning and I felt like it was a very pleasant read and covered a good breadth of material. I probably wouldn't use it until you've had at least a first course in calculus and linear algebra though.

vapid patrol
#

i am currently reading that book too, its a free book

hard hound
#

There's also a great book on ML by Ian Goodfellow and Yoshua Bengio

lapis sequoia
#

//*[@name="meslab"] like this

zealous tulip
silver widget
#

hi guys. got a question about kaggle house prices data; been investigating other solutions to improve my code and perspective. see sth like that

Getting the correlation of all the features with target variable.

(train.corr()**2)["SalePrice"].sort_values(ascending = False)[1:]

what is the reason for using train.corr()**2?

#

ops sorry double * makes the code bold

#

pow(train.corr(),2) is better to write here

#

oh silly me.. got it tnx anyways guys
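
The thread does not say why the correlations are squared, but one likely reason is that squaring drops the sign, so strongly negative features rank alongside strongly positive ones. A sketch on invented data (the `Area` and `Age` columns are made up; only `SalePrice` comes from the question):

```python
import pandas as pd

train = pd.DataFrame({
    "SalePrice": [100, 200, 300, 400],
    "Area": [10, 22, 29, 41],   # rises with price
    "Age": [40, 28, 22, 9],     # falls with price
})

corr = train.corr()["SalePrice"]
print(corr)  # Age has a large negative correlation

# Squaring makes the ranking sign-agnostic; iloc[1:] skips SalePrice itself.
squared = (corr ** 2).sort_values(ascending=False).iloc[1:]
print(squared)
```

On a real dataset with text columns, recent pandas versions need `train.corr(numeric_only=True)`.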

sly salmon
#

For neural networks, how do you get the partial derivative of the cost function?

For tensorflow models, is this hardcoded depending on the cost function you choose, like MSE?

polar stag
#

is the IBM data science certificate on coursera good? or can you people recommend me a good one? to mention, i'm new to data science.

distant phoenix
fallen trellis
desert oar
fallen trellis
#

Unless you iterate over the entire dataset, no

desert oar
#

Ah, then no

fallen trellis
#

Even if, how would numpy handle loading the 5gig+ dataset?

cedar sun
#

the problem isnt numpy

#

the problem is ur ram xd

#

u can read from a buffer i believe

#

Lets say, first 256 Mb of the data set, then the other 256, and so on
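
A sketch of that chunk-by-chunk idea with only the standard library. The byte-averaging task and the tiny chunk size are placeholders; for a real file you would pass something like `open(path, "rb")` and a chunk size of `256 * 1024 * 1024`.

```python
import io


def mean_in_chunks(stream, chunk_size=4):
    """Average the byte values of a binary stream without loading it all at once."""
    total = 0
    count = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        total += sum(chunk)
        count += len(chunk)
    return total / count if count else 0.0


# io.BytesIO stands in for a large file opened in binary mode.
stream = io.BytesIO(bytes(range(10)))
print(mean_in_chunks(stream))
```

Only one chunk is held in memory at a time, so peak usage depends on `chunk_size` rather than on the size of the dataset.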

noble drum
#

the more giant data you have to buffer, the more you should consider Dask

digital aurora
#

R u into software engineering?

noble drum
#

I only dabble.

digital aurora
#

I see!

light merlin
#

Where would be the best place to learn neural networks (preferably in python) for something like facial recognition?

velvet thorn
#

also, memory mapping

grave frost
#

5 GB is not that big
cries in 2Gb of memory

sly salmon
#

Hey guys, gradient descent Q

I read this:

A larger learning rate leads to a faster learning process at a cost to be stuck in a suboptimal solution (local minimum). A smaller learning rate might produce a good suboptimal or global solution, but it will take it much longer to converge. In the extremes, a learning rate too large will lead to an unstable learning process oscillating over the epochs. A learning rate too small may not converge or get stuck in a local minimum.

I don't get it.
A larger learning rate may mean that you miss the global minimum and end up somewhere else, but why does it mean you are stuck?
while with a tiny learning rate, won't you most definitely be stuck in the first local minimum you get into?

velvet thorn
#

first question, possibly, depends

#

you might diverge

sly salmon
#

diverge? as in, miss the global minimum?

velvet thorn
#

no

#

diverge meaning increase without bound

sly salmon
velvet thorn
#

yes

sly salmon
#

hmm 🤔 why would that be, wouldn't you always go down the path of the negative gradient, thus moving towards a lower loss all the time?
could you give a possible scenario for this? My idea would be a function like y=x^2, where if you have a large learning rate you always overshoot the minimum, but I don't think that explains the loss increasing indefinitely

tidal bough
#

if you have a large enough learning rate, you jump from x to -x -a for some positive a

#

there the slope is higher, so next you jump to x + a + b where b>a...

#

and continue bouncing off the walls of the parabola, getting further and further away from the minimum
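
The bouncing can be reproduced on f(x) = x**2 in a few lines. The two learning rates are arbitrary picks, one on each side of the stability limit for this function:

```python
def descend(x, learning_rate, steps):
    """Run gradient descent on f(x) = x**2, whose derivative is 2*x."""
    path = [x]
    for _ in range(steps):
        x = x - learning_rate * 2 * x
        path.append(x)
    return path


print(descend(1.0, learning_rate=0.1, steps=5))  # shrinks toward 0
print(descend(1.0, learning_rate=1.1, steps=5))  # flips sign and grows
```

Each step multiplies x by (1 - 2 * learning_rate). With a rate of 1.1 that factor is -1.2, so x changes sign every step and its magnitude keeps growing, even though every step does follow the negative gradient.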

velvet thorn
#

bouncy

#

...sorry I'll keep quiet now

sly salmon
#

so in gradient descent, how do you actually know that you reached the minima - the gradient vector's parameters will all be 0? (so the gradients in each axis are 0 thus it's a minima)?

tidal bough
#

Well, yes, a zero gradient is basically the condition for a minimum, though a simpler way is just checking that you haven't moved much this step
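
A sketch of the "haven't moved much" stopping rule on a simple bowl-shaped function. The function, the tolerance, and the step limit are illustrative choices.

```python
import math


def grad(point):
    """Gradient of f(x, y) = (x - 1)**2 + (y + 2)**2."""
    x, y = point
    return (2 * (x - 1), 2 * (y + 2))


def minimize(start, learning_rate=0.1, tolerance=1e-8, max_steps=10_000):
    point = start
    for step in range(max_steps):
        g = grad(point)
        new_point = tuple(p - learning_rate * gi for p, gi in zip(point, g))
        moved = math.dist(point, new_point)
        point = new_point
        if moved < tolerance:  # barely moved: treat as converged
            return point, step + 1
    return point, max_steps


point, steps = minimize((5.0, 5.0))
print(point, steps)
```

Because the step length is the learning rate times the gradient norm, a tiny step and a near-zero gradient are the same test up to a constant factor.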

sly salmon
#

ok gotcha. I also didn't fully understand why stochastic gradient descent is less susceptible to getting stuck in a local minima compared to batch gradient descent.

iirc, the formula for updating our weights is proportional to our losses, so:
new_weights = old_weights + learning_rate*(negative_gradient_vector * loss)

If stochastic gradient descent is less likely to get stuck in a minimum, that means that the loss has to be greater? But why is that the case? Surely, if you take the loss of one of your predictions (instead of your whole dataset), you are not guaranteed to have a greater loss so I would think it's unfair to say it's less likely to get stuck in a local minima.

Maybe you get lucky and SGD chooses a random point with a loss that is greater than your whole dataset's loss. Then I can see why it's less susceptible to getting stuck. But still, it's a bit "random" and is a chance. Is this why people say that?

tidal bough
#

I'm not sure about this, but it might simply be that since it's nondeterministic, it'll eventually luck into a path out of a local minima, unlike deterministic ones that are definitely stuck

desert oar
#

what do you mean by "stuck in a minimum"?

#

you want to get stuck in a minimum, that's the whole point of doing gradient descent

sly salmon
#

i mean, a local minimum which may not be the global minimum

desert oar
#

that's different from "not converging", which is what reptile was talking about (and what you were asking about) with batch gradient descent

#

gradient descent only ever finds local minima

grave frost
#

I don't know, but a lot of things in ML are not theoretically backed - it's just found that in practice x works better than y and so on

sly salmon
#

oh really?

#

I was just trying to think about how we can get ourselves out of a local minima and continue to a global minimum

#

so, when people say "stochastic gradient descent is less susceptible to getting stuck in a local minima", what does that mean?

desert oar
grave frost
#

just curious, then how do we do that? adam and such?

desert oar
#

so the idea is that it's less likely to get stuck in a small local minimum because it might just skip over it

#

that's my understanding, at least

#

but ultimately it's still finding a local minimum, there's no guarantee (that i know of) that it's a global minimum
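
To see where the randomness comes from, this sketch compares the full-batch gradient with the gradients of individual samples for a one-weight model. The data is synthetic and the seed is arbitrary.

```python
import random

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [3 * x + random.gauss(0, 0.5) for x in xs]  # noisy line with slope 3


def sample_gradient(w, x, y):
    """Derivative of the squared error (w*x - y)**2 with respect to w."""
    return 2 * (w * x - y) * x


w = 0.0
per_sample = [sample_gradient(w, x, y) for x, y in zip(xs, ys)]
full_batch = sum(per_sample) / len(per_sample)

print("full-batch gradient:", full_batch)
print("single-sample gradients range from", min(per_sample), "to", max(per_sample))
```

The full-batch gradient is the average of the per-sample ones, so each SGD step is a noisy estimate of the batch step. That scatter is what can carry it past a small dip, and it is also why the end result is still only a local minimum.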

sly salmon
#

ah I see, so just due to the random nature of SGD, it can randomly pick a prediction which has a high loss and makes you skip over say, a local minimum, and you might then get to a global minimum

#

but for batch, stochastic and mini-batch, they essentially all just converge at the first local minimum they find (most of the time)

#

so yeah, as @grave frost, what would you use then to find the global minimum?

#

what if the neural network re-runs with different initialized weights, multiple times, to try to find the global minimum

desert oar
#

you won't ever know that it's global

sly salmon
#

yeah, I was thinking that, so there's essentially no way to know if its a global minimum?

#

or maybe you can differentiate the cost function and find each turning point, then you'd have an idea of which areas to check and one of them will be a global minimum

desert oar
#

you can't ever know. you can compare loss values at 2 different local minima, but that's it
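
A sketch of comparing local minima found from different starting points, on a made-up one-dimensional loss with two valleys. Restarts show which of the minima found is better, not that either is global.

```python
def loss(x):
    return x**4 - 3 * x**2 + x


def loss_gradient(x):
    return 4 * x**3 - 6 * x + 1


def descend(x, learning_rate=0.01, steps=2000):
    for _ in range(steps):
        x -= learning_rate * loss_gradient(x)
    return x


starts = [-2.0, -0.5, 0.5, 2.0]
results = [(loss(x), x) for x in map(descend, starts)]
for value, x in results:
    print(f"ended at x = {x:.3f} with loss {value:.3f}")

best_loss, best_x = min(results)
print("best found:", round(best_x, 3))
```

Starts on the left end in one valley and starts on the right end in the other, and the only comparison available is between the loss values at the points actually reached.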

grave frost
#

but if that is indeed the case, then why is it that changing the seed of the model does not yield much of an accuracy difference? does this imply 9/10 times a model does find a global minima?

sly salmon
desert oar
#

yeah, because realistically there aren't that many minima, or different initializations don't have that much of an effect on which minimum is chosen

grave frost
#

so even if different initializations get stuck on a local minima, then what? I use something different?

desert oar
#

but you don't and can't know that they are local, non-global minima

grave frost
#

would tell me I was in a local minima, doesn't it?

sly salmon
#

so why can't you just differentiate the cost function to find all the minimas then compare the losses between them? Because some cost functions can have infinite amount of minima?

desert oar
#

the derivative of the cost function is the gradient

#

gradient descent is how we attempt to find a minimum

sly salmon
#

hmm, so the cost function is not in the form (an example) y = f(x)?

desert oar
#

huh?

#

back up

#

what do we do in order to fit a model

#
  1. define a loss function
  2. minimize the loss function
#

right?

sly salmon
#

yes

desert oar
#

so how do you propose to find all minima?

grave frost
desert oar
grave frost
desert oar
#

brute force how? re-initialize at 1000 different points and re-run gradient descent for each one?

sly salmon
#

well... I might just be spouting rubbish... but, if you had the cost function y = f(x),
can't you differentiate it to get the gradient of each axis?

I guess then you would have to sub in numbers into each derivative so that all derivatives equal 0, and that would find you minimas

desert oar
#

can't you differentiate it to get the gradient of each axis?
the gradient is the vector of partial derivatives

sly salmon
#

yes

grave frost
desert oar
cedar sun
#

how good is downloading a model that seems to work, download some random images cuz the data set used to train that model is gone, use that model to clean the data i downloaded, and use this cleaned data to train model for better results?

desert oar
#

realistically models probably don't have that many minima

sly salmon
sly salmon
grave frost
sly salmon
desert oar
#

(which i am not sure is even possible)

grave frost
#

then? wouldn't brute forcing be faster?

desert oar
#

brute forcing how? computing the derivative at "every" point?

grave frost
velvet thorn
#

you’re basically talking about a grid search over the whole feature space

#

computationally intractable

desert oar
grave frost
#

hmm...have NN's been tried to find faster alternatives to SGD?

sly salmon
#

good talk, this community rocks 😎

desert oar
#

@sly salmon derivative == 0 just means it's locally flat, could be a saddle point
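
A small illustration of the saddle-point remark, using the textbook function f(x, y) = x² − y² (chosen here as an example, it is not from the chat). The gradient vanishes at the origin, yet the Hessian has eigenvalues of both signs, so the point is neither a minimum nor a maximum.

```python
import numpy as np

def f(x, y):
    # Classic saddle: curves up along x, down along y.
    return x**2 - y**2

def grad_f(x, y):
    # Vector of partial derivatives of f.
    return np.array([2 * x, -2 * y])

# The Hessian of f is constant for this function.
hessian = np.array([[2.0, 0.0],
                    [0.0, -2.0]])

print(grad_f(0.0, 0.0))             # gradient is zero at the origin
print(np.linalg.eigvalsh(hessian))  # mixed signs: a saddle, not a minimum
print(f(0.0, 0.1) < f(0.0, 0.0))    # stepping along y lowers f
```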

velvet thorn
desert oar
#

and yeah i think that's what they're proposing - solve analytically for all roots of the derivative and compare the loss at each one

velvet thorn
#

you can’t do that because

desert oar
#

i assume that's not possible

velvet thorn
#

the function is overdetermined

#

like

#

okay imagine you have

sly salmon
velvet thorn
#

3x + y = 6
x - y = -2

#

you can solve that

#

but if you have

velvet thorn
#

3x + y = 6
x - y = -2
x + y = -3

#

there’s no consistent solution

#

to all those equations

#

now remember that

#

each set of feature values and target

cedar sun
#

this is

velvet thorn
#

forms one such equation

cedar sun
#

Rouché–Frobenius (?)

#

or something like that

velvet thorn
#

and you often have many more data points than features

grave frost
#

regrets not learning fully about SGD

velvet thorn
#

think about linear regression

#

you can’t draw a line

cedar sun
#

In linear algebra, the Rouché–Capelli theorem determines the number of solutions for a system of linear equations, given the rank of its augmented matrix and coefficient matrix.

velvet thorn
#

that goes through all points, right?

#

same concept

sly salmon
velvet thorn
#

(basically)

sly salmon
#

so we have to find it via some exploratory technique with gradient descent?

velvet thorn
#

yes

#

it’s late so I won’t go into the details but

#

think about it this way

#

take a piece of cloth

#

no matter how you contort it

grave frost
#

but...can't we just solve for each 2 and average the solutions?

sly salmon
#

alright, I appreciate it. really good talk I learnt a lot today!

velvet thorn
#

it must have a minimum

#

a “lowest valley”

#

it’s a physical necessity

sly salmon
desert oar
velvet thorn
#

they do not meet at any single point

sly salmon
#

okay, how does that relate to minimas?

#

or is that just an analogy

velvet thorn
#

there are several points to make

#

okay let’s continue this another time?

#

bedtime for me

sly salmon
grave frost
#

gn!

sly salmon
#

also, what do you guys mean by solving the cost function "analytically"? I've never heard the term before

#

Analysis is the branch of mathematics dealing with limits and related theories, such as differentiation, integration, measure, infinite series, and analytic functions. These theories are usually studied in the context of real and complex numbers and functions. Analysis evolved from calculus, which involves the elementary concepts and techniques o...

#

i guess this answers that question

desert oar
#

@sly salmon "analytically" means finding an exact solution by solving equations

#

i.e. "set the derivative equal to 0 and solve for x" is the analytical solution

#

as opposed to the numerical solution which doesn't require solving for the exact form

#

"analysis" in the sense of "real analysis" is a different thing

sly salmon
#

i see. and yeah, that example given before:
3x + y = 6
x - y = -2
x + y = -3

that simultaneous equation could essentially be replaced by all my partial derivatives, and it may be impossible to find a consistent solution. Ig I could use it as an analogy to say that "there are no consistent values where it's a minimum", so we have to take the iterative "gradient descent" approach.

#

but if I do it that way, essentially I'm saying all of my gradient vectors will never meet at one point? Thus they are never going to equal the same value (0) where there's a minima? < I might be wrong there.

But then the question lies...
If there is no consistent solution analytically, how can there be a solution iteratively (via gradient descent)? Or maybe the answer is just an approximation, hmmm.
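
One way to look at the closing question, sketched with the three equations from the chat: the inconsistent system is "prediction equals target for every row", but the loss being minimized is the sum of squared residuals, and that function still has a minimum. Below, `np.linalg.lstsq` gives the closed-form least-squares answer and a hand-written gradient descent loop (learning rate and step count are arbitrary choices) reaches the same point. The residuals stay non-zero, which is what "no consistent solution" means here.

```python
import numpy as np

# The three equations from the chat, written as A @ [x, y] = b.
A = np.array([[3.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
b = np.array([6.0, -2.0, -3.0])

# Closed-form ("analytical") least-squares solution.
exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Iterative solution: gradient descent on the mean squared residual.
w = np.zeros(2)
lr = 0.05
for _ in range(5000):
    residual = A @ w - b
    w -= lr * (2 / len(b)) * (A.T @ residual)

print("lstsq:           ", exact)
print("gradient descent:", w)
print("residuals at the minimum:", A @ exact - b)  # non-zero: no exact solution
```

For linear least squares the "set the derivative to zero" equations can be solved exactly; for a neural network loss they generally cannot, which is why the iterative route is used there.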

limpid oak
#

Hello friends

#

need some help

worldly ruin
#

Anybody know if pandas has expressed intent to port the package to arm for m1 macs?

somber prism
#

can someone explain me why variance in ml referring to overfitting but in statistics its measuring the how much the data is spread from the mean ? i am little bit confused 😐

limpid oak
#

I'm inserting data into DB using json files

#

with os.walk() but after some times speed decreased

#

any solution for this

desert oar
#

@limpid oak you should ask this question in a help channel, and provide the code that you are using

limpid oak
#

thank you @desert oar

somber prism
desert oar
#

in the context of model overfitting/underfitting, people usually refer to the "variance" of the entire model-fitting procedure

#

imagine that you could randomly re-generate your data over and over, then fit your model on each version of the data

#

then you would have a probability distribution of models, more or less

#

the definition of "variance" never changes
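
The thought experiment above can be simulated directly. This is only a sketch under made-up assumptions: the data come from a noisy sine curve, the "models" are polynomial fits of two different degrees, and the variance is measured on the predictions across many re-generated datasets.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)

def make_data(n=20):
    # Hypothetical data-generating process: noisy sine curve.
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.5, size=n)
    return x, y

def average_prediction_variance(degree, n_repeats=200):
    # Refit the same kind of model on fresh data each time and
    # record what it predicts on a fixed grid of inputs.
    preds = np.empty((n_repeats, grid.size))
    for i in range(n_repeats):
        x, y = make_data()
        coeffs = np.polyfit(x, y, deg=degree)
        preds[i] = np.polyval(coeffs, grid)
    # Variance across refits at each grid point, averaged over the grid.
    return float(preds.var(axis=0).mean())

var_simple = average_prediction_variance(degree=1)
var_flexible = average_prediction_variance(degree=7)
print("variance of degree-1 fits:", var_simple)
print("variance of degree-7 fits:", var_flexible)
```

The flexible fit typically changes far more from one dataset to the next, and that spread is the "variance" people mean when they talk about overfitting. It is the ordinary statistical definition applied to the fitted model's predictions.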

somber prism
#

oh ok thanks

tulip ridge
#

hey there.. anyone knows how to develop algorithm using python3

lapis sequoia
#

I'm trying to get some data from wikipedia but wikipedia's data is so dirty so is there an easy way to clean it or is there another cleaner alternative to it?

bronze skiff
#

aren't there like a billion wikipedia datasets out there

#

just google around

lapis sequoia
#

no

unkempt lion
#

(ping me when u respond cuz ima be afk coding)

unkempt lion
main kernel
#

no

teal wadi
#

hello

#

how do i get permission to talk ?

late shell
#

Hello, can someone help me in simple linear regression. I have a feature total_spend and target sales. Now I scale this data and train my model and get the estimates for beta0 and beta1 such that Y = beta0 + beta1 * total_spend. But the beta's I have right now are estimated for the scaled data, so its somewhere between 0 and 1. But this is a problem because I cannot use these beta's for inference i.e to study the effect on sales of a one unit increase in total_spend. So how do I get my beta's back to my original scale?

ripe forge
#

Save the scaling step as well. You have to apply the same scaling on inference also

#

Otherwise your model is pointless. It must be fed data with the same scaling for both train and inference

#

Once you do that, you'll realise your model is more like Y = beta0 + beta1 * f_scaling(total_spend)
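
Since the scaling function is known, the coefficients can also be converted back algebraically. The sketch below assumes min-max scaling was applied to both the feature and the target (the chat does not say which scaler was used) and uses synthetic data. With x_s = (x − x_min)/x_range and y_s = (y − y_min)/y_range, substituting into y_s = b0_s + b1_s·x_s gives the original-scale slope and intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
total_spend = rng.uniform(50.0, 500.0, size=200)   # synthetic data
sales = 4.0 + 0.05 * total_spend + rng.normal(scale=1.0, size=200)

# Min-max scale both columns (assumed scaling; adapt if yours differs).
x_min, x_range = total_spend.min(), np.ptp(total_spend)
y_min, y_range = sales.min(), np.ptp(sales)
x_s = (total_spend - x_min) / x_range
y_s = (sales - y_min) / y_range

# Fit on the scaled data: y_s = b0_s + b1_s * x_s
b1_s, b0_s = np.polyfit(x_s, y_s, deg=1)

# Undo the scaling algebraically.
beta1 = b1_s * y_range / x_range
beta0 = y_min + y_range * b0_s - beta1 * x_min

# Sanity check against a fit on the raw data.
raw_b1, raw_b0 = np.polyfit(total_spend, sales, deg=1)
print(beta0, beta1)
print(raw_b0, raw_b1)
```

`beta1` is then the change in sales per one-unit increase in total_spend on the original scale.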

#

That should let you do any analysis as you see fit

grave frost
#

yeah, I would have chimed in to make a custom preprocessing layer if your pre-pro gets a bit complex - but def not for linear regression

lapis sequoia
#

any idea how can you sort an np array with strings that follows the same kind of sorting of linux file systems?

serene scaffold
sly salmon
#

Say you had a 1000 simultaneous equation with 20 variables. would solving each equation for a consistent solution be insanely computationally hard and long?

velvet thorn
#

more like like

#

it depends

velvet thorn
lapis sequoia
#

hi

hallow sundial
#

Is anybody free for a short call about a few questions about datascience and AI?

bronze skiff
#

just post your questions no one wants to be called

ruby peak
#

YEs

trim cobalt
#

I have a quick question

#

Could I ask for some help with coming up with ideas for a future project. I am not a very creative person but I want a fun project to do with AI and CV

#

please ping me as I have this server muted

merry ridge
#

Can anyone explain what is going on here? I can't figure out why my dataframe is giving me the wrong length.

main dome
#

the indices skip some numbers

merry ridge
#

I realized it just as you typed that

#

Thank you, I spent way too much time looking for something else

main dome
#

rippp

mint palm
#

first time doing on pycharm instead of jupyter ....the output is correct but there's a bunch of red text following it...is it nothing to bother about...?

polar stag
#

guys, i'm new to this data science and i'm serious to have a career in it. can you guys suggest me books/courses or any vids to start it?

robust charm
#

Has anyone here used the dlib library? Im try to make a face recognition program and im having a little issue

short heart
#

The perfect amount of epochs would be the one that ends with the minimum loss?

late shell
#

Hello, while doing a simple linear regression, using just 1 feature. My MSE keeps on increasing a lot by each epoch until python gives out overflow error. What could this mean? Why is MSE increasing?
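
One common cause of an MSE that grows every epoch is a learning rate that is too large for the scale of the feature, so each update overshoots by more than the last. This is only a guess at what is happening here. The sketch uses synthetic data and arbitrary learning rates to show the same model diverging on a raw feature and converging once the feature is standardized.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1000.0, size=100)        # unscaled feature (synthetic)
y = 3.0 + 0.5 * x + rng.normal(size=100)

def run_gd(x, y, lr, epochs=20):
    # Batch gradient descent for y ~ b0 + b1 * x; returns MSE per epoch.
    b0, b1 = 0.0, 0.0
    history = []
    for _ in range(epochs):
        err = b0 + b1 * x - y
        history.append(float(np.mean(err**2)))
        b0 -= lr * 2 * np.mean(err)
        b1 -= lr * 2 * np.mean(err * x)
    return history

diverging = run_gd(x, y, lr=0.01)              # step too large for this scale
x_scaled = (x - x.mean()) / x.std()
converging = run_gd(x_scaled, y, lr=0.1)       # same model, standardized x

print(diverging[:3], diverging[-1])
print(converging[:3], converging[-1])
```

Lowering the learning rate or scaling the feature are the usual first things to try.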

limpid oak
#
  try:
    shcSurveyNo = shcSurvey.split('/')[0] 
    
#     villDF['name_match'] = villDF['PIN1'].apply(lambda x: 'Match' if x==shcSurveyNo else 'Mismatch')

    if shcSurveyNo in villDF['PIN1'].unique():
      print(shcSurveyNo,'Yes')
      villDF['shc']=1
      
    else:
      villDF['shc']=0
      print(shcSurveyNo,'No')  
  except:
    print("Something went wrong!!!!!!!!")
#

what I'm missing here, please hel[

#

help

#
65 Yes
38 No
185 Yes
396 Yes
373 Yes
#

but in df its only show 1
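
One possible reading of the snippet above: `villDF['shc'] = 1` assigns the value to every row, so each pass overwrites the whole column and only the last survey number decides what is left. If the goal is a per-row flag, a vectorized sketch could look like the following. The sample values and the `surveys` list are hypothetical, and it assumes `PIN1` holds strings (an integer column would need converting first).

```python
import pandas as pd

# Hypothetical stand-ins for the real data.
villDF = pd.DataFrame({"PIN1": ["65", "38", "185", "7", "396"]})
surveys = ["65/1", "185/2A", "396", "999/4"]

# Keep only the part before the slash, as in the original snippet.
survey_nos = {s.split("/")[0] for s in surveys}

# Flag each row individually instead of overwriting the whole column.
villDF["shc"] = villDF["PIN1"].isin(survey_nos).astype(int)
print(villDF)
```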

hard hound
#

Hey does anyone use any cloud service here for computing?

boreal summit
#

I have a DataFrame in which I'm trying to count the number of times a certain string exist in a particular column. All the methods I've tried didn't work out.

#

For instance, in a DataFrame, under the name column, I'm trying to find rows that contain the word 'Mega', and count the total number of times the word appears.

short heart
#

would training with many epochs, finding epoch with as less loss in the end, and limiting epochs to this amount be good?

hard hound
#

@short heart I dont know much but when I increase epochs it decrease my loss and increases accuracy

#

@boreal summit Hey would you tell me a way you tried?

#

did you try count()?

boreal summit
#

I already tried using

mm = data['name'].str.contains('mega')

Then I passed the Boolean above

#

Then I passed the Boolean to data

#

It didn't work.

#

The logic didn't even work, so it didn't get to the count part.

#

I've also tried str.find()

#

They seem to work online but not with what I'm doing ATM.
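
A guess at what might be going wrong: `str.contains('mega')` is case-sensitive, so it would not match 'Mega', and missing values in the column can make the resulting mask unusable for indexing. A minimal sketch with a made-up sample frame:

```python
import pandas as pd

# Hypothetical sample; the real column is data['name'].
data = pd.DataFrame({"name": ["Mega Charizard", "Pikachu", "mega mega X", None]})

# Rows whose name contains the word, ignoring case and missing values.
mask = data["name"].str.contains("mega", case=False, na=False)
rows_with_mega = data[mask]
n_rows = int(mask.sum())

# Total occurrences across the column (a row can contain it more than once).
n_total = int(data["name"].str.lower().str.count("mega").sum())

print(rows_with_mega)
print(n_rows, n_total)
```

`n_rows` counts matching rows, while `n_total` counts every appearance of the word.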

hard hound
#

could send a screenshot?

boreal summit
#

Okay

boreal summit
hard hound
#

Great

short heart
#

Say, if Ive got a really small loss (2.0239e-04), but result is pretty bad, is that underfitting?

fiery cipher
#

I have a question: can I use min-max normalization and then Z-score normalization? In theory it would work well, but I am not sure, because I read it is recommended to use only one normalization method

desert oar
#

It's "under" fitting in that the model isn't representing/learning enough of the variation in the data

desert oar
short heart
#

then, next question

#

is 6500 values enough to train

#

or 11000

desert oar
#

It depends entirely on the data and the model

#

There is no magic number

short heart
#

kind of stock price

lapis sequoia
#

11000

desert oar
#

How are you evaluating the model?

#

What is the model anyway?

short heart
#

lstm layers

desert oar
#

What kinds of features are there? Is it classification or regression? Etc etc

short heart
#

with batch normalization layers and relu in between

hoary wigeon
#

hey

desert oar
hoary wigeon
#

what is the case to drop the column with missing data ?

#

more than 90% missing value in that column ?

desert oar
#

There is no rule or magic number for that either

hoary wigeon
#

for what ?

desert oar
#

Why is the data missing? Why did you want to include that column in the first place?

hoary wigeon
#

like

#

i have 77% record with missing Age data

#

in dataset

desert oar
#

Well subjectively that sounds like it might not be useful. But I don't know the specifics of your situation. Maybe that column is necessary and you need to do some more work

hoary wigeon
#

its just for practise

#

it is about titanic

#

😆 .

desert oar
#

in that case, this is a great opportunity to practice being smart about missing data

hoary wigeon
#

someone told me when there is column with missing data over 90% just drop that column

desert oar
#

don't attempt to follow or even invent strict rules for discarding data

#

it always depends on the situation

#

i happen to know that in the titanic dataset, age is important

hoary wigeon
#

but what the use of column

#

when there is 90 data missing

desert oar
#

but you don't know that up front

hoary wigeon
#

so i must replace it with median

#

-_-

desert oar
#

that's for you to figure out. maybe you can infer the data from somewhere else

#

or maybe its missingness or lack of missingness is itself a feature
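
The "missingness as a feature" idea can be sketched in a couple of lines of pandas. The tiny frame below is made up in the spirit of the Titanic data, and median filling is just one baseline choice, not a recommendation.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample in the spirit of the Titanic data.
df = pd.DataFrame({
    "Age": [22.0, np.nan, 38.0, np.nan, 4.0],
    "Survived": [0, 1, 1, 0, 1],
})

# Record which rows were missing before filling anything in.
df["Age_missing"] = df["Age"].isna().astype(int)

# Simple baseline fill; the indicator keeps the fact that it was imputed.
df["Age_filled"] = df["Age"].fillna(df["Age"].median())
print(df)
```

A model can then use both `Age_filled` and `Age_missing`, so the information that a value was absent is not thrown away.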

hoary wigeon
#

i know age matters there

desert oar
#

maybe you can infer broadly a range of values from other data, even if you don't know the exact value

#

or maybe you just drop it and see what happens 🙂

hoary wigeon
#

I have only option to replace it with median

#

mean is close to median

desert oar
#

you might want to look into the different kinds of missing data.. "missing completely at random", "missing at random", and "not missing at random" (MCAR, MAR, and NMAR)

noble drum
#

why is that your only option? pithink

velvet thorn
near cosmos
#

And then forget about the missing at random idea because it's always a terrible assumption 😉

fleet dove
#

say I have a robotic wheelchair controlled by eye movements, can I class the user as an actuator?

tidal bronze
#

what could be reasons for pandas groupby to output rows with duplicate keys for the columns they are being grouped on?

When I try to remove them with .drop_duplicates() the problem persists, but the rows are clearly the same for those keys.

shadow knot
#

hi, first timer here. i'd like to ask some general question regarding how you choose a machine learning algorithm to build a model, more specifically an image classification recognition problem

#

to my understanding, generally I would want to look at the data, judge its distribution, its features and go from there. But that answer seems too generalized and is there any format or "guideline" that i could follow?

tidal bronze
desert oar
shadow knot
#

i am executing an assignment in school and one of the criteria is to choose a number of base models and justify why I chose it over a dataset of 20,000 RGB images of size 27x27, essentially making its feature dimensionality up to 729 if im handling it by greyscale value of individual pixels

desert oar
#

whereas for "social science" data usually there is no right answer and you might need to try several things

#

how many do you have to choose, and do you have to justify all of them or just the "best" one?

#

what constitutes a "model" in this case? are imagenet and resnet considered different models for the purposes of this assignment?

shadow knot
# desert oar how many do you have to choose, and do you have to justify all of them or just t...

dont have to choose a certain number, just have to justify why i chose it over other models.

I do have to justify all of them, but only to some degree. After hyper parameter tuning and feature selection, i am required to make an ultimate judgement and recommend a "best" one.

forgot to mention but I am not allowed to use pre-trained models, therefore i believe imagenet and resnet are out of bound. But if they werent, they would be considered 2 models.

#

right now i am going with Logistic Regression, Random Forest, CNN, SVM and KNN to cover all different "type" of algorithms

desert oar
#

i'd recommend looking into kernel SVM, specifically radial basis kernel, which as far as i know was very popular before deep learning took over

#

knn is an interesting choice, because you specifically need to define how to measure the "distance" between images

shadow knot
#

which brings me to my next question. Due to the number of features i have, i was thinking about dropping KNN since from what i've read, KNN efficiency drops when you introduce a high dimensional dataset

cedar sun
#

can i use threads with a neural network?

shadow knot
#

the nature of the dataset is medical. the images are cell images and my two tasks consist of:

  1. Classify if it's a cancerous cell
  2. Classify if it's a specific type of cell

my original thought was i could use the "distance" metric for KNN instead of "uniform" since a cell of similar nature should have more "weight" when voting compared to a cell of completely different structure

desert oar
#

as far as i understand, pre-deep-learning image classification depended heavily on special-purpose feature engineering

#

so you could theoretically still use KNN if you could significantly reduce the size of the feature space

shadow knot
#

what's your thought on Logistic Regression?

#

regularization could help bring the size of the feature space down, which is my primary reason on selecting it as one of the base model

#

i will be implementing Bagging Random Forest as my feature selection technique so some feature removal will be done there but imo a model that could do that also is a plus right?

merry ridge
#

Taking the Fourier transform and keeping only a few of the terms with the largest Fourier coefficients seems like the first strategy that would come to mind for me
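
A minimal sketch of that strategy with NumPy's FFT, on a synthetic 27x27 image (the image contents and the choice of k = 40 are arbitrary). It keeps the k largest-magnitude 2-D Fourier coefficients and reconstructs the image to see how much is lost. In a feature-extraction setting, the kept coefficients themselves would be the reduced features.

```python
import numpy as np

def keep_top_k_fourier(image, k):
    # 2-D FFT, then zero everything except the k largest-magnitude coefficients.
    coeffs = np.fft.fft2(image)
    flat = np.abs(coeffs).ravel()
    keep = np.argsort(flat)[-k:]
    mask = np.zeros(flat.size, dtype=bool)
    mask[keep] = True
    compressed = np.where(mask.reshape(coeffs.shape), coeffs, 0)
    # Reconstruct so the approximation can be inspected.
    return np.fft.ifft2(compressed).real

rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:27, 0:27]
image = np.sin(xx / 3.0) + np.cos(yy / 5.0) + 0.1 * rng.normal(size=(27, 27))

approx = keep_top_k_fourier(image, k=40)
error = np.linalg.norm(image - approx) / np.linalg.norm(image)
print("relative reconstruction error with 40 of 729 coefficients:", error)
```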

shadow knot
desert oar
#

yeah i would prefer something "intelligent" that reduces the feature space rather than somewhat-randomly discarding features

merry ridge
#

To me an image is just sound with a higher dimension

desert oar
#

people used to do all kinds of signal processing stuff for ML on images, and probably still do

shadow knot
merry ridge
#

I saw an example of this technique recently where they had multiple examples of biblical art with cherubs holding some kind of writing and used it to decode the text effectively

shadow knot
merry ridge
#

You’re not really discarding random features though. It is analogous to PCA discarding eigenvectors with the lowest eigenvalue. Unless you consider that also discarding those eigenvectors is also discarding random features which is certainly reasonable

shadow knot
#

what could be some of the performance metric that is generally good for this type of classification?

#

i know about the general one like accuracy, recall, precision and f1

grave frost
#

overall, F.T is worth experimenting with, but given how rarely it's used in real-world imaging, it doesn't seem as promising a candidate as, say, PCA

#

which is also easier and more commonly used 🤷

shadow knot
grave frost
#

sound can be represented as useless images spectral features, but images can't be represented as sounds, can they?

merry ridge
#

I’m just saying in a hand wavy way that many of the techniques used in dsp translate directly and nicely to image processing. I wouldn’t delve too deeply into a sound are pictures metaphor

grave frost
#

fair enough

desert oar
desert oar
#

you can also consider using a proper scoring rule like brier score, but neural networks tend to have very poor probability calculation
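
For reference, the Brier score in the binary case is the mean squared difference between the predicted probability and the 0/1 outcome, so lower is better. A small sketch with made-up labels and probabilities:

```python
import numpy as np

def brier_score(y_true, p_pred):
    # Mean squared difference between predicted probability and outcome.
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))

y_true = [1, 0, 1, 1, 0]
confident = [0.9, 0.1, 0.8, 0.7, 0.2]   # made-up model outputs
hedging = [0.5, 0.5, 0.5, 0.5, 0.5]     # always says "50/50"

print(brier_score(y_true, confident))
print(brier_score(y_true, hedging))
```

Unlike accuracy, it rewards well-calibrated probabilities rather than just the right side of a threshold.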

shadow knot
cedar sun
#

if i could have multiple threads predicting images

#

having the same model loaded

desert oar
#

sure, although if the underlying model prediction code is already multi-threaded then you don't want to start mixing in your own threading

#

also in python specifically multi-threading for computation doesn't work well due to something called the "global interpreter lock"

#

so in python you really need to use processes for parallel computations, threads are good for parallel/concurrent i/o but not cpu-bound computation

cedar sun
#

Mmmm

#

in english pls? xDDD

#

sorry i didnt understand. May i tell u my plan

#

and u tell me if it is doable?

desert oar
#

yes, it helps if you are more specific

inland zephyr
#

Hello, I want to ask again about dummy datasets for face recognition using vector similarities. Let's say I have a thousand vectors from a thousand known people in my vector database. Someone told me I need to add some dummy vectors with an unknown class. But I'm still confused about why I need to add unknown dummy vectors to the known dataset. Is this for performance testing?

cedar sun
#

Okey. I downloaded a bunch of pokemon images. I found on github a model that is supposed to predict pokemon images. I wanna use this model to clean the images i downloaded. Can i use threading to increase the speed?

#

nvm, from 95 images the model i downloaded fails on 71

#

easy peasy

#

i guess i have to manually clean the data :D

#

gl hf

#

Is good for the training model passing this image as Bulbasaur Label?

#

no, right?

desert oar
cedar sun
#

i though about threads cuz

#

imagine i have the first 6 pokemons

#

if i have 3 threads,

desert oar
cedar sun
#

t1 checks pokemon[0], t2 pokemon[1] and t3 pokemon[2]. When they are done, t1 will move to pok[3], t2 to pok[4] and t3 [pok5]

#

threads for predicting, not training

desert oar
#

there are 2 problems:

  1. in python, 2 threads can't execute computations in parallel. this is a python-specific limitation.
  2. the underlying machine learning library might already be using multithreaded computations.
#

so you want to explicitly turn off multithreading, group your data into batches, and have processes making predictions on their own batches
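
The batching-with-processes part of that advice might look roughly like this, using the standard library's `ProcessPoolExecutor`. `predict_batch` is a hypothetical stand-in that fakes a label; a real worker would load the model once per process and predict on the whole batch. Turning off the ML library's own threading is library-specific and is not shown.

```python
from concurrent.futures import ProcessPoolExecutor

def make_batches(items, batch_size):
    # Split the work into contiguous batches, one task per batch.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def predict_batch(paths):
    # Hypothetical stand-in: a real worker would load the model once
    # per process and call its predict method on the whole batch.
    return [(path, len(path) % 3) for path in paths]

def predict_all(paths, batch_size=2, workers=3):
    batches = make_batches(paths, batch_size)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = pool.map(predict_batch, batches)
    # Flatten the per-batch results back into one list, in order.
    return [item for batch in results for item in batch]

if __name__ == "__main__":
    images = [f"pokemon_{i}.png" for i in range(6)]
    print(predict_all(images))
```

With six images, a batch size of 2, and three workers, each process handles one batch, which is close to the "t1, t2, t3" plan described above but without the GIL getting in the way.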

cedar sun
#

mhmm okey, so no threads

grave frost
#

can you send me the repo?

cedar sun
#

it doesnt clean xd

#

i just though about using that already trained model to predict the images i downloaded xd

#

if the prediction matches the class where i downloaded the image, then it is a good download xD

grave frost
#

the probability that a random model on github generalizes on general data is slim, but ehh

inland zephyr
cedar sun
#

this is the one i found

#

loss: 0.1279 - accuracy: 0.9743 - validation loss: 0.9940 - validation accuracy: 0.7917

#

But failing 71 out of 95 doesnt fit that accuracy

#

xDDDD

inland zephyr
#

so if I have 300 vectors from 300 image classes, I would add, for example, 600 random dummy vectors with an unknown class. This is to measure performance, to see how often the vector search returns an unknown or wrong class. But I think it can work without the dummies, unless the vectors of each class are pretty close in distance

grave frost
#

didn't expect much either

inland zephyr
#

it's clearly overfit since the validation and train loss margin are too big (0.1 to 0.9)

cedar sun
grave frost
cedar sun
#

i cant

grave frost
#

then why are you saying you are using it? 🤔

#

what does inference have to do with data cleaning?

cedar sun
#

?

grave frost
#

that's not cleaning bruv

cedar sun
#

it is

#

imagine the model works 100%

grave frost
#

its more considered under general pre-processing

cedar sun
#

see the 081

grave frost
#

but anyways

cedar sun
#

it is not a bulbsaur

#

then model will say it is venosaur

#

but that img in on bulbasaur class

grave frost
#

don't really matter if the quantity of outliers is less

#

you can always compensate by robustness for the model

grave frost
cedar sun
#

mmm

#

so may i use this model and do transfer learning with it??

#

even i have some images that wont match the class?

desert oar
cedar sun
#

wait may i do?

desert oar
cedar sun
#

This is how many images i have per class

cedar sun
cedar sun
desert oar
#

you are asking, why does it seem like a good idea?

cedar sun
#

cuz even if that model sucks, it has already seen some pokemon images

grave frost
desert oar
#

depends on how many and how bad

grave frost
#

looks at cassava

cedar sun
#

;(

desert oar
cedar sun
#

oh

#

i read it doesnt seem

#

sorry, mb

icy python
#

after learning basic python, if you want to learn ML and AI, where would you start?

lapis sequoia
#

Tensorflow

#

NumPy and Scipy are famous

icy python
#

I've heard of numPy

lapis sequoia
#

you can even learn them

lapis sequoia
icy python
#

How should I learn them though

lapis sequoia
#

There are a lot of tutorials on youtube

#

and read documentation

icy python
#

Okay, thank you

lapis sequoia
#

🙂

#

apart from pandas, what other modules are useful for AI and data science?

desert oar
#

there is probably a good "data science with python" book

icy python
#

ok, i actually do need to learn basic python first i was asking for future reference but thank you!

desert oar
#

but it doesn't help you learn the stats or math

lapis sequoia
#

can it be a basic level of probability and statistics?

desert oar
desert oar
lapis sequoia
#

i mean, do you have to learn advanced probability and statistics

desert oar
#

yes, eventually

#

but it depends on what you mean by "advanced"

desert oar
#

do you need to fully understand the measure theoretic definition of probability? no

lapis sequoia
#

oh kk

desert oar
#

imagine a university graduate who got an A in calculus, probability, linear algebra, and statistics

#

if you have that level of training, you know enough

#

you don't need to learn all of it at once, of course

#

you should strive to gradually learn more of it over time

lapis sequoia
#

ah ok

polar stag
#

can someone recommend me a good book to start data science with python?

desert oar
#

@polar stag i just recommended two of them above

polar stag
#

got it. thanks

sharp reef
#

Is there a way to keep a jupyter notebook running while it's not actively opened in the browser and keep the output?

uncut barn
#
  def softmax(self):
    return np.exp(self.dataset) / np.sum(self.dataset)
#

does anyone know why the plot of my softmax looks wrong

#

the dataset only has 1 dimension
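
A likely culprit in the snippet above is the denominator: softmax divides by the sum of the exponentials, not by the sum of the raw values. A standalone sketch (written as a plain function rather than the method from the chat), with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(x):
    # Subtracting the max does not change the result but avoids overflow in exp.
    x = np.asarray(x, dtype=float)
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

probs = softmax([1.0, 2.0, 3.0])
print(probs, probs.sum())
```

The outputs should be positive and sum to 1, which is a quick check to run before plotting.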

iron basalt
sharp reef
#

I have my jupyter notebook set up on a remote server

iron basalt
#

ssh into the server, run the file

sharp reef
#

at that point what's the point of the notebook then?

iron basalt
#

The point of notebook is exactly as the name implies, a notebook.

sharp reef
#

If I ssh I'll have to make sure to run it such that it doesn't stop when I close the ssh session as well..

iron basalt
#

It's useful for sharing things too.

sharp reef
#

Basically I need it to stay up so I can run experiments on azure over night

#

I'm not sure if it's possible to submit experiments in a queue

#

that would make it easier

iron basalt
#

What could be easier than making a script that does all this with one click?

#

(can tmux too)

sharp reef
#

I know about screen

#

it's mostly about things being harder to edit if I have to sftp them in and out every time..

iron basalt
#

So you want a remote editing tool?

#

That's just a text editor / IDE feature.

sharp reef
#

yeah but it's not remote

#

I spent way too much time getting the notebook to run remote in the first place which kinda feels wasted now

#

I figured it would be possible since colab can keep sessions alive when a notebook isn't open

iron basalt
#

So you have notebook running on a remote server with the --no-browser option?

sharp reef
#

yeah

#

I access it via the browser to edit and run notebooks

iron basalt
#

If you run some cell(s) it should just keep running.

#

Try running notebook server on your local machine: jupyter notebook --no-browser --port=8080, connect, make a new notebook, run an infinite loop that prints something, and exit. It should just keep running as long as the server is running.

sharp reef
#

hmm yeah it keeps running when I close it but when I open the notebook again it restarts the kernel (?)

iron basalt
desert oar
sharp reef
iron basalt
sharp reef
iron basalt
#

If you already ran something the output is still there.

sharp reef
#

hmm I'll have to try it again

iron basalt
#

If you want it to run while you are not connected, you need to download the python file which runs all the cells and do it manually.

#

But the manual is not too hard, just need to write a small script that uploads and runs it on a screen.

sharp reef
#

No it's definitely still running in the background

#

it shows as running and when I run a new code cell it queues it

#

so it can keep the process running in the background

#

It doesn't seem to be able to track which cell is running though

iron basalt
#

So when I ran a simple input echo loop that runs forever and came back the cell had finished running

sharp reef
#

and the output is ofc not kept in the cell output but I could log to a file instead

iron basalt
#

The previous input output echos were still there

sharp reef
#

the tab still shows the hour glass icon and trying to run another cell queues it instead of running it immediately

iron basalt
#

Idk when I came back to the notebook it was no longer asking for input which means the loop has stopped.

sharp reef
#

Yea I verify it, I ran a loop that sleeps for ~30 seconds, closed the tab, opened the notebook again and queued another code cell

#

it showed as queued and executed a few seconds later

#

so in principle it seems to work, it just doesn't recognize which code cell is currently running and stops printing the output

#

maybe that's good enough if I log to a file instead

grave frost
#

yea, I just check the tab logo to see if something is running

#

but for output, logging is the simplest way
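
A minimal sketch of the log-to-a-file approach with the standard `logging` module. The file name and the loop are placeholders. `force=True` (Python 3.8+) is used on the assumption that the notebook environment may already have configured logging handlers.

```python
import logging
import time

# Write progress to a file so it survives closing the browser tab.
logging.basicConfig(
    filename="experiment.log",      # hypothetical file name
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    force=True,                     # replace any handlers already installed
)
log = logging.getLogger("experiment")

for epoch in range(3):
    time.sleep(0.1)                 # stand-in for real work
    log.info("epoch %d finished", epoch)
```

The file can then be followed from an ssh session or reopened later, independent of whether the notebook tab is still attached to the kernel.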

sharp reef
grave frost
sharp reef
#

I store epoch-level stats in the checkpoint itself

#

the logging would be more for azure stuff

steep rapids
#

Hey everyone, I'm trying to determine the average treatment effect for a problem I've been working through. To achieve this, I'm using dowhy + econML. When using the ForestDRLearner that is a part of these packages, the results I'm getting back for an average treatment effect are waaaaaay bigger than they should be. Does anyone know what could be causing this?

For example, range of outcome variable is [-10, 10], binary treatment, ATE is -50

grave frost
#

how much? down-payment? 😛

lapis sequoia
#

I want to learn AI in University, I need a guidance please Help me!

#

Bsc Artificial Intelligence

cinder lantern
#

hey i got some trouble with pandas, anybody out here?

#

trying to get EMA5 from ccxt api with pandas, but its giving me some diff return values

#
import ccxt
import pandas as pd

exchange = ccxt.binance({ 'enableRateLimit': True })
ohlcv = exchange.fetch_ohlcv(symbol = 'DOGE/USDT', timeframe='1h', limit=5)
data = map(lambda x: [ x[4]], ohlcv)
df = pd.DataFrame(data, columns = ['close'])
df['ewm'] = df['close'].ewm(span=5, min_periods=0, adjust=False, ignore_na=False).mean()
print(df)
desert oar
#

hm, that looks like the right usage to me. what were you expecting that you didn't get?

cinder lantern
#

lemme send some samples

desert oar
#

that'd be great

cinder lantern
#

output:
0 0.33208 0.332080
1 0.33461 0.332923
2 0.33247 0.332772
3 0.33088 0.332141
4 0.33400 0.332761

#

following some screenies

#

thats the 3rd index

#

id expect around 0.33354 on the second value

#

but it says 0.332141

#

just to be sure that the blue line indicates EMA5

#

i have updated the whole script on my first message ^

desert oar
#

thanks for the updated code

#
                 close  close_ewm
timestamp                        
1622134800000  0.33208   0.332080
1622138400000  0.33461   0.332923
1622142000000  0.33247   0.332772
1622145600000  0.33088   0.332141
1622149200000  0.33494   0.333074

so this is what i got

cinder lantern
#

ye same as i did, thanks

desert oar
#

im not sure what im looking at with these screenshots

cinder lantern
#

lemme screen the whole area

desert oar
#

0.332141 should be 3.4something?

#

they might be doing something subtly different with their ewma calculation

cinder lantern
cinder lantern
# cinder lantern

this is the close price from that hour, which refers to the first column, index 3

desert oar
#

what does the (3,5) indicate?

cinder lantern
#

those are the 2 EMA indicators that i put on

#

EMA3 and EMA5

desert oar
#

oh, that's the value for the 3-period and 5-period versions

#

i see

#

are they using a different definition of "period"?

cinder lantern
#

yes EMA5 would be calculated over the last 5 periods of 1 hour

desert oar
#

what kinds of timestamps are these? i thought they were unix timestamps but then they're dates tens of thousands of years in the future

cinder lantern
#

UTC timestamp in milliseconds, integer

desert oar
#

ahh

#

that makes more sense

cinder lantern
#

i tried it with timestamps, but the order seems right

#

i tried to reverse the array tho, but that shouldnt be

desert oar
#

btw here is how i loaded the data

import ccxt
import pandas as pd

exchange = ccxt.binance({ 'enableRateLimit': True })
ohlcv = exchange.fetch_ohlcv(symbol = 'DOGE/USDT', timeframe='1h', limit=5)
df = pd.DataFrame(ohlcv, columns = ['timestamp', 'open', 'high', 'low', 'close', 'volume'])
df.set_index('timestamp', inplace=True)
df.sort_index(inplace=True)

so yes they are definitely in ascending order

#
df['close_ewm'] = df['close'].ewm(span=5).mean()
print(df[['close', 'close_ewm']])

this gives me

                 close  close_ewm
timestamp                        
1622134800000  0.33208   0.332080
1622138400000  0.33461   0.333598
1622142000000  0.33247   0.333064
1622145600000  0.33088   0.332157
1622149200000  0.33494   0.333225

but i can't get the 2nd-to-last one to be something that rounds up to 0.34

limpid raven
#

im not sure if this is the correct place but hi guys, im trying to plot my own trendline onto this graph, how do i do it

desert oar
limpid raven
#

the exponential line you see has been converted into a linear line which has the value of 0.6559.. etc and i want to plot it over it, so i could see the two overlapping

limpid raven
desert oar
#

so you want to plot a line with slope 0.6559? what is the y intercept, 0?

cinder lantern
desert oar
#

yeah @cinder lantern im not sure. maybe they are doing something slightly different with their calculation

cinder lantern
#

ey alr, might google further

#

thanks anyways tho

#

lookin at this for 2 hrs alr

limpid raven
#

it didnt work 😦 unless i did it wrong

#

i dont know if there is a y intercept, i got this line by using the straight line equation, (y2-y1)/(x2-x1)

cinder lantern
#

lol i got it

#

man im dumb af

#

@desert oar

#

we could have seen this from our testresults

#
     close      ema3      ema5
0  0.33876  0.338760  0.338760
1  0.34095  0.339855  0.339490
2  0.34117  0.340512  0.340050
3  0.33920  0.339856  0.339767
4  0.33584  0.337848  0.338458
#

see how 0 is the same while having a different span. its cuz it had no history to calculate with....

#

i only queried 5 periods, but for the first period, i have to query 4 more backwards in time
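
A small sketch of why the short query gave different numbers, using made-up prices instead of a live `fetch_ohlcv` call. With `adjust=False` the first EMA value is seeded with the first close, so an EMA computed from only the last 5 candles has no history to lean on, while one computed over a longer window and then trimmed does. This assumes the chart uses the usual recursive EMA definition, which the thread does not confirm.

```python
import numpy as np
import pandas as pd

# Made-up, steadily rising close prices standing in for exchange data.
close = pd.Series(0.30 + 0.0002 * np.arange(200))

span = 5
alpha = 2 / (span + 1)

# EMA over the full history, then keep only the last 5 rows.
ema_full = close.ewm(span=span, adjust=False).mean().tail(5)

# EMA computed from only the last 5 rows: the first value is just the close.
ema_short = close.tail(5).ewm(span=span, adjust=False).mean()

# The recursion pandas applies with adjust=False:
# ema[t] = alpha * close[t] + (1 - alpha) * ema[t-1], seeded with close[0].
manual = [close.iloc[0]]
for price in close.iloc[1:]:
    manual.append(alpha * price + (1 - alpha) * manual[-1])

print(pd.DataFrame({"full_history": ema_full, "last_5_only": ema_short}))
```

In practice that means asking the exchange for many more candles than the span (a larger `limit`) and only reading the tail of the result.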

cedar sun
#

can i train a model on colab with files on my local machine?

iron basalt
#

%%capture output

#
Code doesn't stop on tab closes, but the output can no longer find the current browser session and loses data on how it's supposed to be displayed, causing it to throw out all new output received until the code finishes that was running when the tab closed.
#

(Also why my input loop stopped, it could not fetch input anymore)

broken stratus
#

Does anyone know if there is a way to scrape fb data

south gull
#

sure there is

desert oar
#

@broken stratus discussion of scraping facebook would be against our server rules, sorry.

#

that's because it violates the facebook terms of use, and we are not allowed to help with that

vital apex
# limpid raven

You are just plotting a,b and these are points. If you want a line calculate trendline = a*x + b and plt.plot(x,trendline) .
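
A rough sketch of that suggestion, assuming made-up sample points and an intercept of 2.0 (only the 0.6559 slope comes from the question). `np.polyfit` with `deg=1` returns the slope and intercept of the least-squares line; the helper `plot_with_trendline` is a hypothetical name.

```python
import numpy as np

# Made-up sample points; replace with the real x/y data.
x = np.arange(10, dtype=float)
noise = np.array([0.1, -0.2, 0.05, 0.0, -0.1, 0.2, -0.05, 0.1, 0.0, -0.1])
y = 0.6559 * x + 2.0 + noise

# Least-squares fit of a straight line: slope a and intercept b.
a, b = np.polyfit(x, y, deg=1)
trendline = a * x + b  # one y value per x, so it can be drawn as a line


def plot_with_trendline(x, y, trendline, a, b):
    """Overlay the fitted line on the original points."""
    import matplotlib.pyplot as plt  # imported here so the fit itself needs only NumPy

    plt.plot(x, y, "o", label="data")
    plt.plot(x, trendline, "-", label=f"y = {a:.4f}x + {b:.4f}")
    plt.legend()
    plt.show()
```

Calling `plot_with_trendline(x, y, trendline, a, b)` draws both on the same axes, so the overlap is visible.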

torpid ember
#

hey guys i need to manually map specific values to other values. Its like 55 records with specifically different (no logic) mapped values. i have two alternatives im thinking about:

  1. make a dictionary with 55 values and associated values, or
  2. just put them in two columns and match them up via a .csv or .xlsx file.

Just wondering if theres a good practice for stuff like this? What would you recommend?
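
For what it's worth, the two options combine easily: keep the pairs in a file and turn them into a dictionary at load time. A sketch with hypothetical column names (`old`, `new`, `code`) and an in-memory stand-in for the CSV:

```python
import io

import pandas as pd

# Hypothetical lookup table; in practice this would be a mapping.csv file on disk.
mapping_csv = io.StringIO("old,new\nA1,red\nB7,blue\nC3,green\n")
mapping = pd.read_csv(mapping_csv).set_index("old")["new"].to_dict()

df = pd.DataFrame({"code": ["B7", "A1", "C3", "ZZ"]})
df["label"] = df["code"].map(mapping)  # values missing from the mapping become NaN

# Listing the unmapped values makes gaps in the lookup table easy to spot.
unmapped = df.loc[df["label"].isna(), "code"].tolist()
print(df)
print("unmapped:", unmapped)
```

Keeping the 55 pairs in a file means non-programmers can edit them, while the code only ever sees a plain dict.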

opaque stratus
#

Hey

#

I am training a Keras model

#

a CNN for sentence classification

#

TensorFlow tells me my GPU is available, but how can I discretely see if the GPU is being utilized during training?

opaque stratus
#

fixed!

opaque stratus
#

!code

arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

sleek otter
#

the select query keeps on giving error 1241. Operand should contain 1 column. Would anyone here know where the problem might be?

ripe forge
inland zephyr
#

guys does anyone know a python project for classifying whether a face in a photo is real or fake (a photo taken of a 2nd phone/monitor), using any kind of method, that i could use as a reference?

short heart
#

Whats the purpose of Dense layer?

inland zephyr
#

to wrap up the entire convolutions layers

worthy bear
#

help me please....

#

i tried all tricks..

#

the fig size is not increasing

#

heeeloooooooo

#

please advice...

lapis sequoia
#

maybe try making it horizontal?

low hornet
#

Oh I see, your graph is tiny and no, I don't know how to make it bigger, sorry

grave frost
#

So...I was thinking about gradient descent

#

suppose we have a simple equation where the variables are the weights for the network, and the equation is the loss function. so we would basically want to locate argmin

#

but instead of using SGD the whole time, why can't we graph it?

so like we take n samples of different random weights - and we visually graph it (not the TF graph). we store it in a data structure, say the weights in one column and the output loss in the other. As we store the values in the data structure, we build a visual graph of it as we go.

now, after we try n different combinations of weights, we see where all the local minima in the visual graph lie.
(Obviously we won't compute it for the whole domain, only a certain number of specific values.)
lets call n ---> resolution of the graph. Thus, with a decent enough resolution, we can at least guess where the global minimum might be.

Thus, we take the guess of the weights that might correspond to a minimum, and then we do SGD on it. so basically we initialize the weights and biases closer to a guess of the global minimum.

on the graphs, mathematically we can calculate minima: suppose we have a 3-D loss surface, then a point whose surrounding points are all greater than it would be a local minimum

we do this a few times (which would take milliseconds) and then we would have a quite good initialization for the weights of the NN.

why don't we do this?

winged stratus
# grave frost but instead of using SGD the whole time, why can't graph it? so like we take `n...

something very similar to this is what metaheuristics like simulated annealing do - they randomly take some weights and explore multiple minima and hope to find a global minimum. in theory this sounds good but in practice metaheuristics suck at training neural networks.

also, finding the global minimum isn't necessarily the best thing, it could be overfit to that data. a good enough local minimum that generalizes well is enough

ripe forge
#

Simply because it won't be as simple as you assume. These functions can get gnarly. And we don't really use sgd as is, we use it to teach sure, but we usually use some clever tricks on top of sgd (look up Adam or AdaGrad)

winged stratus
#

as Darr said, randomly selecting weights and letting sgd do its work isn't very efficient. the loss space is so huge that 99.99% of the time it would be better to start from a single random initialization and let it train for longer
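
A toy sketch of the sample-then-descend idea, on a made-up 1-D loss with several local minima. With one weight, 50 random samples cover the range reasonably well. The catch raised above is that the number of samples needed for the same "resolution" grows exponentially with the number of weights, so this does not carry over to a real network.

```python
import numpy as np


def loss(w):
    # Toy non-convex 1-D loss with several local minima (made up for illustration).
    return 0.1 * w**2 + np.sin(3 * w)


def grad(w):
    # Analytic derivative of the toy loss.
    return 0.2 * w + 3 * np.cos(3 * w)


def gradient_descent(w, lr=0.01, steps=500):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w


rng = np.random.default_rng(0)

# "Low-resolution graph": evaluate the loss at n random weights and keep the best.
n = 50
candidates = rng.uniform(-6, 6, size=n)
best_start = candidates[np.argmin(loss(candidates))]

w_sampled = gradient_descent(best_start)     # descend from the best sampled point
w_single = gradient_descent(candidates[0])   # plain single random initialization

print(f"sampled init: w={w_sampled:.3f}, loss={loss(w_sampled):.3f}")
print(f"single init:  w={w_single:.3f}, loss={loss(w_single):.3f}")
```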

late shell
#

If my data only has 1 feature, is feature scaling still required?

late shell
#

😔 But I thought scaling makes sense when there are at least 2 features out of scale, can you give me an explanation as to why it is needed with just 1 feature?

teal nova
#

can anyone explain mcmc

#

i get the monte carlo part, i also know what markov chains are, but i dont get how u put them together and how it works

#

like markov chains of parameters? what does that even mean

grave frost
#

but if you are doing linear regression, then it doesn't matter
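
A quick check of that claim on made-up data, assuming ordinary least squares solved in closed form (here via `np.linalg.lstsq`): the fitted predictions are the same whether or not the single feature is standardized. Models trained with gradient descent or with regularization can behave differently, so this only covers the plain linear regression case.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1000, size=50)              # one unscaled feature (made-up data)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=50)


def fit_predict(feature, target):
    # Ordinary least squares with an intercept column.
    X = np.column_stack([feature, np.ones_like(feature)])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return X @ coef


x_scaled = (x - x.mean()) / x.std()            # standardized copy of the feature

pred_raw = fit_predict(x, y)
pred_scaled = fit_predict(x_scaled, y)

print(np.allclose(pred_raw, pred_scaled))
```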

lament stag
#

Can you help me? Why does the accuracy remain constant in the epoch results in the cnn model?

desert oar
# teal nova can anyone explain mcmc

The super tldr: there are ways to construct a markov chain such that the equilibrium distribution of the markov chain is a particular probability distribution. The really cool (and useful) part is that you can do this without knowing the exact analytical form of the distribution function. This enables us to fit and sample from complicated Bayesian models for which computing the exact form of the distribution function (especially the normalizing constant) would be intractable or impossible.

#

This general category of algorithms is called "Markov chain Monte Carlo". Typical MCMC algorithms include Metropolis-Hastings, Gibbs Sampling, Hamiltonian Monte Carlo, and the No U-Turn Sampler.
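
To make that concrete, here is a bare-bones random-walk Metropolis-Hastings sketch. The target is a standard normal with its normalizing constant deliberately left out, to show that only ratios of the unnormalized density are needed. The step size, sample count, and burn-in length are arbitrary choices for illustration.

```python
import numpy as np


def unnormalized_density(x):
    # Target known only up to a constant: a standard normal without 1/sqrt(2*pi).
    return np.exp(-0.5 * x**2)


def metropolis_hastings(n_samples, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = 0.0
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal()  # symmetric random-walk proposal
        # Acceptance uses only a ratio, so the normalizing constant cancels out.
        accept_prob = min(1.0, unnormalized_density(proposal) / unnormalized_density(x))
        if rng.random() < accept_prob:
            x = proposal
        samples[i] = x  # the current state is recorded even when the proposal is rejected
    return samples


samples = metropolis_hastings(20_000)[2_000:]  # drop the first 2,000 draws as burn-in
print(samples.mean(), samples.std())
```

Each new state depends only on the previous one (the Markov chain part), and the recorded states are used as random draws from the target (the Monte Carlo part).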

desert oar
#

however there could be other issues here, it looks like the legend is very big but the main plotting axis is not

cedar sun
#

do jpg files of mxn pixels have the same quality as if it was png?

soft viper
#

any good paper for image processing?

tidal bough
cedar sun
#

so answer is no?

tidal bough
#

JPEG quality depends on the settings - the higher the compression, the more it butchers the image

#

if quality is important, use PNG

cedar sun
#

well, my pokemon data set was full on png format. it was 4GB i guess. now i changed so that images with 3 channels are saved as jpg. size reduced by half

#

but idk if a png of 3 channels has the same quality of a jpg :D

#

I mean, i did this cuz i need to upload the dataset to drive :(

tidal bough
#

nah, JPEG can compress better because it introduces artifacts

cedar sun
#

isnt jpg = jpeg?

tidal bough
#

same, yes

cedar sun
#

ok ok

tidal bough
#

I mean better than PNG

#

you can play around with saving images to JPEG, and comparing them with the originals

#

here's an example

cedar sun
#

anyway, now that u mentioned artifacts, i guess is good having artifacts on ur dataset, so model trains better

#

o.O

#

ah! also... i have another question

tidal bough
# cedar sun anyway, now that u mentioned _artifacts_, i guess is good having _artifacts_ on ...

A compression artifact (or artefact) is a noticeable distortion of media (including images, audio, and video) caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available...

cedar sun
#

sometimes... when u convert png to jpg, background has weird things... and my model works with 3 channels

#

so those shiity backgrounds... may i preprocess them?

tidal bough
#

not sure what you mean by weird things

cedar sun
#

wait

desert oar
cedar sun
#

no

#

i dont mean that

#

discord can open png, so it looks like this

#

if i open this image as RGB

#

no alpha

#

it looks like this

#

it is because the background has color, but since alpha channel is there, it isnt being painted

desert oar
#

yep, (255, 255, 0, 0) looks the same as (255, 0, 255, 0) because they're both fully transparent

cedar sun
#

so should i do some kind of preprocessing like if alpha is 0 then paint the pixel full white

#

or something?

desert oar
#

yeah i was about to suggest that

cedar sun
#

okey

#

dammit

tidal bough
#

Yeah, you need to blend the alpha-channel into the image, since JPEG doesn't support RGBA

#
from PIL import Image
import io, numpy as np
fake_file = io.BytesIO()
img = Image.open(r"D:\Programming\1200px-Typescript_logo_2020.png")
img = img.convert("RGB")
img.save(fake_file,format="jpeg",quality=10)
fake_file.seek(0)
img2 = Image.open(fake_file)
# calculate difference:
arr1 = np.array(img)
arr2 = np.array(img2)
diff = np.abs(arr1.astype(np.int32)-arr2.astype(np.int32)).sum(axis=2)
diff = diff*255/np.max(diff)
diff = diff.astype(np.uint8)

diff_img = Image.fromarray(diff)
diff_img.show()

here's an example of compression artifacts

#

original

desert oar
#

@tidal bough is there an "intelligent" way to do the alpha channel blending? i assume hard coding to white could cause problems if there are light colored or white objects in the image

tidal bough
#

difference with result

cedar sun
cedar sun
tidal bough
#

this is a bad example because this is a very simple image and can be compressed well, but you can see that at the edge of letters and at the rounded edges, there are differences between original and compressed

cedar sun
#

ok ok, i see

tidal bough
desert oar
cedar sun
#

the edge of the hair is gonna be black

#

look, this is what ive done in the past

#
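```py
import numpy as np
def black_white(image):
    # mask: 255 where the pixel is fully transparent (alpha == 0), 0 elsewhere
    return np.where(image[:, :, 3] == 0, 255, 0).astype('uint8')
```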
#

eevee has hair tho

#

i think u mean hair could have some transparency, but some != totally transparent

#

So i only care about the pixels with alpha value == 0

#

maybe the eevee pixels have transparency 100, or maybe 85, or 1. i dont care. What i know is alpha = 0 -> no pokemon
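
One way to handle both cases at once is ordinary alpha compositing onto a solid background: fully transparent pixels become the background color and partially transparent ones (hair edges and so on) are mixed in proportion to their alpha. A NumPy-only sketch, assuming a uint8 array of shape (H, W, 4) and a white background; `flatten_alpha` is a hypothetical helper name.

```python
import numpy as np


def flatten_alpha(rgba, background=(255, 255, 255)):
    """Composite an RGBA uint8 array of shape (H, W, 4) onto a solid background."""
    rgb = rgba[..., :3].astype(np.float32)
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0   # 0.0 = transparent, 1.0 = opaque
    bg = np.array(background, dtype=np.float32)
    out = rgb * alpha + bg * (1.0 - alpha)
    return out.round().astype(np.uint8)


# Tiny made-up image: one opaque red pixel, one fully transparent "green" pixel.
rgba = np.array([[[255, 0, 0, 255], [0, 255, 0, 0]]], dtype=np.uint8)
print(flatten_alpha(rgba))
```

The hidden green of the transparent pixel no longer leaks into the 3-channel image the model sees.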

#

but if u dont trust this 100%, an intelligent way is using a neural network that extracts the object on the picture and its mask :)

#

salient object detection, or something like that, is what this is called

#

there is a model called u2net

fast dune
#

I know I’m butting into the conversation. I recently did image processing for a class (not AI; just regular image processing).

#

Always use PNG for processing. If the user input file is a JPEG, convert it to PNG before operating on it.

#

As for alpha channel problem (Bulbasaur example) you will need to replace the color of the image background with the platform background color.

#

Thankfully, the image background is usually one solid color (white or close to white).

tidal bough
#

oh, there exist default background colors for platforms?

#

how'd you get that?

cedar sun
#

what is a platform?

#

framework?

fast dune
#

No, platform like your GUI program. Discord dark mode uses gray.

cedar sun
#

ah

rose cipher
#

Hey guys, I would like to start in DS and ML, but I am not good at math. What topics of math are used in DS and ML?

#

Do you guys have some material to help me with math?

cedar sun
#

yeah but, for the bulbasaur above, if i pass it to my model for training, model will see green tones on the background

#

cuz it will remove the alpha

tidal bough
#

for ML, a ton of linear algebra and some basic calculus (derivatives, etc)

#

for DS, well, statistics and the probability theory required for it

rose cipher
#

Can you guys recommend me some books to learn math?

#

One last thing. Is geometry used in ML? I just HATE GEOMETRY

tidal bough
#

I mean, not really, unless you count linear algebra as such

fast dune
#

@cedar sun Unfortunately I didn’t do it with ML so I don’t know that answer. Just remember that an image of size 400x500 is always 400x500 unless you physically crop it. Which you don’t because that’s annoyingly hard. Therefore, every pixel in that dimension needs a numerical value.

rose cipher
#

I hate the fact that I love computer world but I have to learn math to understand

tidal bough
#

(heh, I mostly remember disliking geometry in high school because it was too... nonobvious - like, you had to find a way to solve a problem, what equations to write, as opposed to just doing some mindless math like in algebra or even physics)

short heart
#

does anybody know by chance whats the best accuracy anybody has ever gotten in predicting forex/stocks?

rose cipher
#

Do you guys think that Khan Academy is a good place to learn math?

tidal bough
#

linear algebra of the kind you'll be using in ML isn't really geometry-related

tidal bough
rose cipher
#

but it´s not the best right?

cedar sun
#

i think ML needs more about calculus than algebra tho

#

back propagation is basically the heart of ML

#

and thats calculus
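
A tiny illustration of the calculus involved, on a made-up one-neuron "network". The gradient is written out by hand with the chain rule and compared against a finite-difference estimate. All numbers are arbitrary.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(w, b, x, y):
    # One neuron: y_hat = sigmoid(w * x + b), squared-error loss.
    return (sigmoid(w * x + b) - y) ** 2


def grad_w(w, b, x, y):
    # Chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    y_hat = sigmoid(w * x + b)
    return 2 * (y_hat - y) * y_hat * (1 - y_hat) * x


w, b, x, y = 0.5, -0.1, 2.0, 1.0   # made-up values
eps = 1e-6
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
print(grad_w(w, b, x, y), numeric)
```

Backpropagation applies this same chain rule layer by layer, and the layers themselves are matrix products, which is where the linear algebra comes in.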