#data-science-and-ml | Python | Page 202

sand reef Jun 15, 2019, 11:35 AM

#

what he means is that he is making a 2D matrix with 100 columns

#

the np.random thing

#

well, @prisma verge i'll try to see what i can do

prisma verge Jun 15, 2019, 11:36 AM

#

thank you very much for help! 💜

lapis sequoia Jun 15, 2019, 11:36 AM

#

he uses the term dimension for everything so that kinda confused me

sand reef Jun 15, 2019, 11:36 AM

#

yeah, i too try not to get confused on that

prisma verge Jun 15, 2019, 11:37 AM

#

arrays are confusing in themselves

sand reef Jun 15, 2019, 11:37 AM

#

what he meant is that if you were to plot it on a graph, the graph would be 100 dimensions

#

meaning 100 features

lapis sequoia Jun 15, 2019, 11:37 AM

#

oh

#

so input_dim takes the number of columns in an array?

prisma verge Jun 15, 2019, 11:37 AM

#

number of dimensions = number of features?

sand reef Jun 15, 2019, 11:37 AM

#

not really

#

input_dim means what is the shape of array that you are passing into the function

prisma verge Jun 15, 2019, 11:38 AM

#

huh, then it's worth it to always type in input_dim = array.shape

sand reef Jun 15, 2019, 11:38 AM

#

and in ML generally we represent the stuff to learn in a 2D matrix

lapis sequoia Jun 15, 2019, 11:38 AM

#

yeah so we can just do input_dim = array.shape ?

sand reef Jun 15, 2019, 11:39 AM

#

exactly, but instead you say, input_dim = array.shape[1:]

#

because the first dimension is generally the number of examples

#

for a dataset
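
A minimal NumPy sketch of the slicing above, assuming a hypothetical dataset of 1000 examples with 100 features each. `shape[1:]` drops the leading examples axis and leaves the shape of a single example, which is what a Keras layer takes as `input_shape`.

```python
import numpy as np

# Hypothetical dataset: 1000 examples, 100 features each.
data = np.random.random((1000, 100))

n_examples = data.shape[0]      # first axis: number of examples
per_example = data.shape[1:]    # everything after it: shape of one example

print(n_examples)    # 1000
print(per_example)   # (100,)
```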

prisma verge Jun 15, 2019, 11:39 AM

#

ah

#

now that makes perfect sense

lapis sequoia Jun 15, 2019, 11:39 AM

#

so if the shape is (10, 5)
input_dim = 5
?

prisma verge Jun 15, 2019, 11:39 AM

#

(not for my example, but for ml)

sand reef Jun 15, 2019, 11:40 AM

#

yes, if you are passing it one by one

#

if not, then you can pass the whole thing in

lapis sequoia Jun 15, 2019, 11:40 AM

#

then what if the shape is (10, 5, 5)?

sand reef Jun 15, 2019, 11:40 AM

#

means 10 examples, each with 25 features (5x5)

#

or 25 pixels if those are 5x5 images

lapis sequoia Jun 15, 2019, 11:41 AM

#

so input_dim = 25?

prisma verge Jun 15, 2019, 11:43 AM

#

that looks simple now for shapes since i was always confused by stuff like "oh god what shape should i pass into there and how much neurons there should be"

sand reef Jun 15, 2019, 11:44 AM

#

no, the input dim is (5,5)
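
To make the (10, 5, 5) case concrete, a small NumPy sketch with made-up contents: each example keeps the shape (5, 5), and that is 25 values once flattened.

```python
import numpy as np

batch = np.zeros((10, 5, 5))                 # 10 examples, each a 5x5 grid

example_shape = batch.shape[1:]              # (5, 5): shape of one example
n_features = int(np.prod(example_shape))     # 25 values per example

flat = batch.reshape(len(batch), -1)         # (10, 25): one row per example
```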

prisma verge Jun 15, 2019, 11:44 AM

#

huh
so, input_dim always = array.shape[1:]

#

even if it's d100

#

as it seems at least

sand reef Jun 15, 2019, 11:45 AM

#

yep

#

since we dont mention the batch size

#

@lapis sequoia read this

#

https://keras.io/getting-started/sequential-model-guide/

#

so @prisma verge what are we trying to predict?

lapis sequoia Jun 15, 2019, 11:47 AM

#

okay thanks

sand reef Jun 15, 2019, 11:47 AM

#

because I dont see any labels

#

unless we have to generate the labels ourselves

prisma verge Jun 15, 2019, 11:51 AM

#

we gotta predict which values are gonna be the next

#

like,
49, 3, 1, 2

sand reef Jun 15, 2019, 11:51 AM

#

so, from the csv i read and the pic you sent... this seems like a regression problem

prisma verge Jun 15, 2019, 11:52 AM

#

49 = round
3, 1, 2 = team results

sand reef Jun 15, 2019, 11:52 AM

#

and definitely not classification

prisma verge Jun 15, 2019, 11:53 AM

#

That regression is the problem of predicting a continuous quantity output for an example.

#

seems like it

#

it's like predicting results for tv show actually, based on previous results

sand reef Jun 15, 2019, 11:53 AM

#

yeah, since we are not classifying anything

prisma verge Jun 15, 2019, 11:53 AM

#

yes, exactly

#

just i've only built a bit of image classification networks and finetuning one about text generation

#

so i don't know how to handle this situation

#

and it'd be probably quite useful experience for my knowledge of ml

sand reef Jun 15, 2019, 11:54 AM

#

yeah, if my net works T^T

lapis sequoia Jun 15, 2019, 11:55 AM

#

one more question xD

sand reef Jun 15, 2019, 11:55 AM

#

sure

lapis sequoia Jun 15, 2019, 11:56 AM

#

```python
from keras import Sequential
from keras.layers import Dense
import numpy as np

# For a single-input model with 2 classes (binary classification):

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=(200, 100)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

data = np.random.random((1000, 200, 100))
labels = np.random.randint(2, size=(1000, 1))

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
```
just wanted to tweak around with input_dim to help myself understand
i get this error:

#

https://hastebin.com/ulewavuwus.sql

#

but isnt input_dim = data.shape[1:]??

sand reef Jun 15, 2019, 11:59 AM

#

well, in this case its a dense layer right?

#

dense layers, well, need the input to be flattened

lapis sequoia Jun 15, 2019, 11:59 AM

#

so dense layers dont support 3d?

sand reef Jun 15, 2019, 11:59 AM

#

nope, you will have to flatten it, what you can do is add a flatten layer before the dense layer
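
A sketch of what flattening before a fully connected layer amounts to, written in plain NumPy rather than Keras. The batch size of 8 and the random weights are placeholders chosen for illustration, not values from the model above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small batch: 8 examples, each of shape (200, 100).
data = rng.random((8, 200, 100))

# Flatten: keep the examples axis, merge the rest into one feature axis.
flat = data.reshape(data.shape[0], -1)            # (8, 20000)

# A dense layer with 32 units is a matrix multiply plus a bias, then an activation.
weights = rng.normal(scale=0.01, size=(flat.shape[1], 32))
bias = np.zeros(32)
hidden = np.maximum(0.0, flat @ weights + bias)   # ReLU, shape (8, 32)
```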

prisma verge Jun 15, 2019, 12:00 PM

#

the problem with predicting the data imo is that there's not enough of dataset

#

just 47 rounds which is really not a lot for network

lapis sequoia Jun 15, 2019, 12:00 PM

#

like this?
Flatten(input_shape=(200, 100))
?

sand reef Jun 15, 2019, 12:00 PM

#

remove the 1000

lapis sequoia Jun 15, 2019, 12:01 PM

#

its input_shape u sure i gotta remove 1000?

sand reef Jun 15, 2019, 12:01 PM

#

yus

#

the way you are asking is like its a life and death situation

#

i am scared now xD

lapis sequoia Jun 15, 2019, 12:01 PM

#

lmao xD

#

so the first element in the tuple is always the number of examples so i gotta remove it?

sand reef Jun 15, 2019, 12:02 PM

#

@prisma verge i too am confused......we dont have a value for 'y' instead we are just predicting x

#

yus

lapis sequoia Jun 15, 2019, 12:03 PM

#

that thing above can also be:-
Flatten(input_dim=(200, 100))
?

sand reef Jun 15, 2019, 12:03 PM

#

oh okay, we can use the number of days and use multiple linear regression

#

@lapis sequoia yus

#

no wait, there might be a small issue

lapis sequoia Jun 15, 2019, 12:04 PM

#

now i wonder whats the difference between input_dim and input_shape xD

sand reef Jun 15, 2019, 12:04 PM

#

be careful, there is an apparent subtle difference between them
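
As far as the linked Keras guide describes it, `input_dim` is a single integer accepted by some 2D layers such as `Dense`, and it is shorthand for `input_shape=(input_dim,)`, while `input_shape` is a tuple that can describe any number of axes. The helper below is hypothetical (it is not part of Keras) and only spells out that convention.

```python
def resolve_input_shape(input_dim=None, input_shape=None):
    """Hypothetical helper, not a Keras function: show how the two arguments relate."""
    if input_shape is not None:
        return tuple(input_shape)
    if isinstance(input_dim, int):
        return (input_dim,)
    raise TypeError("input_dim must be a single integer; use input_shape for tuples")

print(resolve_input_shape(input_dim=100))            # (100,)
print(resolve_input_shape(input_shape=(200, 100)))   # (200, 100)
```

Under this reading, a tuple such as `(200, 100)` belongs in `input_shape`, not `input_dim`.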

prisma verge Jun 15, 2019, 12:04 PM

#

also, i've asked too much probably already, but may you comment the code for better understanding?
i'm aiming for understanding more than result :p

sand reef Jun 15, 2019, 12:04 PM

#

sure, let me see what i can come up with

prisma verge Jun 15, 2019, 12:04 PM

#

also, really thank you for all the help with this fuss

lapis sequoia Jun 15, 2019, 12:05 PM

#

cool i didnt get an error thanks man

sand reef Jun 15, 2019, 12:08 PM

#

np

#

so, @prisma verge we will have to use linear regression where X will be the day number and Y will be one of the features

prisma verge Jun 15, 2019, 12:09 PM

#

huh, why one of the features though if there's three besides the days?

sand reef Jun 15, 2019, 12:10 PM

#

exactly, we will predict each feature but by using the number of days

#

one by one

prisma verge Jun 15, 2019, 12:10 PM

#

ah

sand reef Jun 15, 2019, 12:11 PM

#

gimme a little bit of time, i'll tag you if i manage to do it

prisma verge Jun 15, 2019, 12:13 PM

#

that'd be really cool!

#

also, if this prediction model won't lie, i'll give you something from the wins, haha

sand reef Jun 15, 2019, 12:13 PM

#

xD

#

well, i got 32% accuracy xD

#

*34%

#

so, i got the accuracy up to 43.75

#

got it up to 50 xD

prisma verge Jun 15, 2019, 12:36 PM

#

how much epochs?

sand reef Jun 15, 2019, 12:37 PM

#

welp, here is the thing @prisma verge

#

it turns out that there is something called Multivariate Multiple Regression

#

that is used to capture the relation between the multiple outputs

#

we might need that

prisma verge Jun 15, 2019, 12:38 PM

#

but there's no that thing in keras
and there's very little info about it except universities and other very scientific resources :p

sand reef Jun 15, 2019, 12:38 PM

#

and I dont think keras or sklearn has that
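
For the purely linear case, several outputs can be fit in one call with ordinary least squares. A minimal NumPy sketch with made-up data (the round numbers and values below are invented, not taken from the CSV): `np.linalg.lstsq` accepts a 2D target with one column per output. This fits each output column independently, so it does not model any relation between the outputs.

```python
import numpy as np

rounds = np.arange(1, 7, dtype=float)                  # made-up round numbers
X = np.column_stack([np.ones_like(rounds), rounds])    # intercept column + round
Y = np.column_stack([2 * rounds + 1,                   # three made-up outputs
                     -rounds + 10,
                     0.5 * rounds])

coef, residuals, rank, _ = np.linalg.lstsq(X, Y, rcond=None)
# coef has shape (2, 3): row 0 holds the intercepts, row 1 holds the slopes

next_round = np.array([[1.0, 7.0]])
prediction = next_round @ coef     # shape (1, 3): one value per output
```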

prisma verge Jun 15, 2019, 12:38 PM

#

aw man

#

that's sad

#

https://www.datacamp.com/community/tutorials/keras-r-deep-learning
this seems to be the thing

DataCamp Community

keras: Deep Learning in R

In this tutorial to deep learning in R with RStudio's keras package, you'll learn how to build a Multi-Layer Perceptron (MLP).

#

because it was linked in this question https://www.researchgate.net/post/Implement_Multivariate_Regression_by_Neural_Network_with_Tensorflow

ResearchGate

Implement Multivariate Regression by Neural Network with Tensorflow?

Hi, now I'm facing new problem at the construction of neural network in tensotflow.

I would like to implement regression/classification problem at a same time in one neural network, which has two...

#

so, bunch of relu's

📎 unknown.png

sand reef Jun 15, 2019, 12:40 PM

#

yeah, its a weird one, i'll have to see how that works

#

and btw I ran with 500 epochs

prisma verge Jun 15, 2019, 12:41 PM

#

i guess, that'd be interesting thing for you but too advanced for beginner like me :p

sand reef Jun 15, 2019, 12:41 PM

#

thing is

prisma verge Jun 15, 2019, 12:41 PM

#

especially when i'm not that good at math

sand reef Jun 15, 2019, 12:41 PM

#

what we need is something to capture the relation between the outputs

prisma verge Jun 15, 2019, 12:41 PM

#

is multivariate regression the only way to do that?

sand reef Jun 15, 2019, 12:41 PM

#

because separately, we are getting the outputs for each of them

#

but they are not like 1, 2 and 3

#

multivariate multiple regression

#

multiple means with multiple inputs

prisma verge Jun 15, 2019, 12:42 PM

#

multiple means multiple outputs?

#

huh

sand reef Jun 15, 2019, 12:42 PM

#

multivariate is the multiple outputs part, so multivariate multiple means the outputs are also considered

prisma verge Jun 15, 2019, 12:42 PM

#

... i'm not really sure how to handle that then

#

since it'd be no use for me since it's just too hard for my basic understanding, and it'd be probably a lot of sucking power from you

sand reef Jun 15, 2019, 12:43 PM

#

i am thinking of doing something really weird tho

prisma verge Jun 15, 2019, 12:43 PM

#

i guess, 30 rounds heists-streak is my dream and will stay that :p

sand reef Jun 15, 2019, 12:43 PM

#

like, train a model on the number of days

#

and then train that model on other inputs

#

its like linking multiple regressions together

prisma verge Jun 15, 2019, 12:44 PM

#

hmmm
hey, that may actually work

sand reef Jun 15, 2019, 12:44 PM

#

lets try that

prisma verge Jun 15, 2019, 12:45 PM

#

like, instead of making whole sandwich, you make sandwich part by part

#

but it's still sandwich :p

sand reef Jun 15, 2019, 12:45 PM

#

yup

#

fk, i ended up with a 1000 epochs lol, gotta tune it down

prisma verge Jun 15, 2019, 12:48 PM

#

woah, that's a lot

sand reef Jun 15, 2019, 12:54 PM

#

welp that didnt work

#

i used the previous model to predict the new model's training values and used the old labels

#

well, the accuracy fell

prisma verge Jun 15, 2019, 12:54 PM

#

... whoops

sand reef Jun 15, 2019, 12:55 PM

#

yeah, i think i see the issue

prisma verge Jun 15, 2019, 12:55 PM

#

seems like Life (TM) logic doesn't work in ML world :p

sand reef Jun 15, 2019, 12:55 PM

#

its that, we dont have enough features

#

we only have one feature to predict stuff off

#

so, its not going to capture the output

prisma verge Jun 15, 2019, 12:55 PM

#

... so, the stuff with multiple regression is needed

sand reef Jun 15, 2019, 12:56 PM

#

yeah

prisma verge Jun 15, 2019, 12:56 PM

#

ah man, why everything nice in this world is so hard :p

sand reef Jun 15, 2019, 12:56 PM

#

or i might be running too many layers

#

apparently if your network becomes too deep, the accuracy starts falling

prisma verge Jun 15, 2019, 12:57 PM

#

how much layers do you have actually?

#

3 hidden should be enough for every feature, no?

sand reef Jun 15, 2019, 12:58 PM

#

2 hidden in first, 1 in second

prisma verge Jun 15, 2019, 12:58 PM

#

huh

#

should be enough, no?

#

weird then

sand reef Jun 15, 2019, 12:59 PM

#

yep, i'm trying to change the activation functions now

#

accuracy went up to 43.75

prisma verge Jun 15, 2019, 1:00 PM

#

but it went to 50 in the past, so it's actually lower than some previous variant

#

hm

sand reef Jun 15, 2019, 1:01 PM

#

yeah, but for some reason the validation accuracy crashes when training hits 50

#

i think that overfits it in the middle

prisma verge Jun 15, 2019, 1:02 PM

#

dropout layer needed, i guess?

#

i mean, dropout usually helps with overfits

sand reef Jun 15, 2019, 1:02 PM

#

well, there is barely anything to dropout

#

that works when your training accuracy hits like 90 and all but validation is bad

#

here our training itself is bad

prisma verge Jun 15, 2019, 1:03 PM

#

hm

sand reef Jun 15, 2019, 1:03 PM

#

```python
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import to_categorical

df = pd.read_csv('/content/gdrive/My Drive/m.csv')
dataset = df.values
# Column 0 is the only input; the remaining columns are the targets.
X = dataset[1:, 0]
Y = dataset[1:, 1:]
#Y = dataset[1:, :]
'''min_max_scaler = preprocessing.MinMaxScaler()
X_scale = min_max_scaler.fit_transform(X)
print(X_scale)'''
# 70% train, then split the remaining 30% evenly into validation and test.
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)

# First model: one input value -> three outputs.
model1 = Sequential([Dense(1, activation='relu', input_shape=(1,)),
                    Dense(32, activation='relu'),
                    Dense(32, activation='relu'),
                    Dense(3, activation='tanh')])
model1.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['accuracy'])
hist = model1.fit(X_train, Y_train,
                 batch_size=1,
                 epochs=100,
                 validation_data=(X_val, Y_val))

# Second model: trained on the first model's predictions, with the original labels.
predictions = model1.predict(X)
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(predictions, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
model = Sequential([Dense(3, activation='relu', input_shape=(3,)),
                    Dense(32, activation='relu'),
                    Dense(3)])
model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['accuracy'])
hist = model.fit(X_train, Y_train,
                 batch_size=1,
                 epochs=100,
                 validation_data=(X_val, Y_val))

# Chain the two models to get a prediction for input 50.
print(model.predict(model1.predict([50])))
```

prisma verge Jun 15, 2019, 1:04 PM

#

so, it can't achieve bigger 50% without math, stats, and tensorflow?

sand reef Jun 15, 2019, 1:04 PM

#

not that I know of

prisma verge Jun 15, 2019, 1:05 PM

#

aw

sand reef Jun 15, 2019, 1:05 PM

#

well, I mean, thats the limit of my knowledge right there

#

I don't know what could increase it.

prisma verge Jun 15, 2019, 1:05 PM

#

not like my knowledge is very better :p

sand reef Jun 15, 2019, 1:05 PM

#

Because as much as I have learnt

#

the only way to increase training accuracy is

prisma verge Jun 15, 2019, 1:05 PM

#

📎 unknown.png

#

so, the least is the best team in eyes of nn?

#

:p

sand reef Jun 15, 2019, 1:06 PM

#

more features, more examples, or sometimes better model

#

yes

#

if you want, you can do one thing

prisma verge Jun 15, 2019, 1:06 PM

#

let's see how this heist goes and if predictions are bad or good

sand reef Jun 15, 2019, 1:06 PM

#

use this function

#

in the end after modeling

prisma verge Jun 15, 2019, 1:07 PM

#

though, after second run the results are very different

📎 unknown.png

sand reef Jun 15, 2019, 1:07 PM

#

yes

#

every time the initial values are initialized randomly

prisma verge Jun 15, 2019, 1:07 PM

#

we need to summon ML specialists!

sand reef Jun 15, 2019, 1:07 PM

#

so, the results will be different
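
A small NumPy illustration of why two runs differ: weights start from random draws, so separate runs begin from different points unless the random generator is seeded. This only demonstrates the idea in NumPy. Making a Keras run repeatable involves seeding the backend as well, which is not shown here.

```python
import numpy as np

def init_weights(seed=None):
    """Draw a 1x3 weight matrix, standing in for a fresh, untrained layer."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(1, 3))

run_a = init_weights()             # unseeded: different on every call
run_b = init_weights()

fixed_a = init_weights(seed=42)    # seeded: identical on every call
fixed_b = init_weights(seed=42)
```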

#

i'll give you a function, to take the values and predict them right

#

gimme a sec, i need to check something lol

lapis sequoia Jun 15, 2019, 1:09 PM

#

what does an activation do?

sand reef Jun 15, 2019, 1:09 PM

#

values after being multiplied by the matrix are passed through a function

#

that function is activation and it introduces non linearity
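
A short NumPy sketch of the non-linearity point, with random placeholder weights: without an activation, two stacked layers are equivalent to a single matrix multiply, and putting ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))     # 4 made-up inputs with 3 features
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 2))

# Without an activation, two layers collapse into a single matrix multiply.
two_linear = (x @ W1) @ W2
one_linear = x @ (W1 @ W2)

def relu(z):
    return np.maximum(0.0, z)

# With ReLU in between, the result is generally no longer that same linear map.
with_relu = relu(x @ W1) @ W2
```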

prisma verge Jun 15, 2019, 1:10 PM

#

ml shows best results until epoch 500 hits

#

then loss chanegs veeeery slowly

#

changes*

lapis sequoia Jun 15, 2019, 1:10 PM

#

should i choose activations based on the situation?

sand reef Jun 15, 2019, 1:11 PM

#

sometimes they do matter, sometimes they dont

prisma verge Jun 15, 2019, 1:11 PM

#

so, what's that function I wonder?

lapis sequoia Jun 15, 2019, 1:11 PM

#

so i can give relu for now?

sand reef Jun 15, 2019, 1:15 PM

#

here

#

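```python
# needs numpy imported as np
def correct(list_values):
  # indices that would sort the predicted values, smallest first
  temp = np.argsort(list_values)
  names = ['Michael', 'Franklin', 'Trevor']
  # list_values is 2D (one row per prediction), so use the first row
  return [names[x] for x in temp[0]]
```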

#

pass in the prediction we made

#

and yes @lapis sequoia you may do so

prisma verge Jun 15, 2019, 1:16 PM

#

instead of model.predict[50]?

lapis sequoia Jun 15, 2019, 1:16 PM

#

Thanks bro i luv you [no homo]

prisma verge Jun 15, 2019, 1:16 PM

#

ah

#

with model.predict[50

#

gotcha

sand reef Jun 15, 2019, 1:16 PM

#

yeah, the weird print() at the end right? pass its parameter into this function

#

and np!

#

works?

prisma verge Jun 15, 2019, 1:17 PM

#

it goes through epochs

sand reef Jun 15, 2019, 1:18 PM

#

well, thats the model lol

#

well, i dont know why am i feeling happy even tho it didnt exactly work out

prisma verge Jun 15, 2019, 1:18 PM

#

because it was interesting task?

#

:p

sand reef Jun 15, 2019, 1:18 PM

#

i guess?

#

so, the function worked?

prisma verge Jun 15, 2019, 1:19 PM

#

it still goes through epochs

sand reef Jun 15, 2019, 1:19 PM

#

where did you get this csv from tho?

prisma verge Jun 15, 2019, 1:20 PM

#

i made it in google tables by copying it from the site

sand reef Jun 15, 2019, 1:20 PM

#

oh? what site was it?

prisma verge Jun 15, 2019, 1:20 PM

#

https://xyzgiveaways.com/

sand reef Jun 15, 2019, 1:20 PM

#

lol reddit only

prisma verge Jun 15, 2019, 1:20 PM

#

oh, i didn't import numpy

#

imported, trying again now

sand reef Jun 15, 2019, 1:21 PM

#

lol

prisma verge Jun 15, 2019, 1:21 PM

#

but yeah, this guy does reddit giveaways which has 2/3 chances to win

sand reef Jun 15, 2019, 1:21 PM

#

i see

prisma verge Jun 15, 2019, 1:21 PM

#

you gotta have 9 streak to get 5 euros steam giftcard

#

and chance of it working is 2.3%
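
A quick check of that figure, assuming each round is an independent 2/3 chance of surviving (an assumption, since the real odds depend on how people pick teams). Under that assumption a 9-round streak comes out slightly above the number quoted.

```python
p_survive = 2 / 3          # assumed chance of surviving one round
streak = 9

p_streak = p_survive ** streak
print(f"{p_streak:.1%}")   # 2.6%
```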

#

so, since i'm interested in ml, i thought "hey that'd be nice to predict with machine capabilities!"

#

but then i got confused with what should i do

#

so, uhhh

📎 unknown.png

sand reef Jun 15, 2019, 1:23 PM

#

wat?

prisma verge Jun 15, 2019, 1:23 PM

#

that's what it outputs

sand reef Jun 15, 2019, 1:24 PM

#

📎 snipsnip.PNG

#

show me the full thing, the one with the 4 frames and upper AttributeError

prisma verge Jun 15, 2019, 1:25 PM

#

📎 unknown.png

#

📎 unknown.png

#

OHH

#

NEVERMIND

#

I CAPTURED JUST ONE PREDICT BUT NOT TWO

sand reef Jun 15, 2019, 1:26 PM

#

lol

prisma verge Jun 15, 2019, 1:28 PM

#

accuracy now doesn't bounce

sand reef Jun 15, 2019, 1:29 PM

#

so, what does this guy do?

prisma verge Jun 15, 2019, 1:29 PM

#

it shows stuff in order "biggest chance > middle chance > lowest chance"?

sand reef Jun 15, 2019, 1:29 PM

#

yeah thats what the function does

prisma verge Jun 15, 2019, 1:30 PM

#

this guy does a giveaway series where you get more points the more your streak is

#

streak = heists without getting onto biggest team

sand reef Jun 15, 2019, 1:30 PM

#

so this time, it was to make a predictive model?

prisma verge Jun 15, 2019, 1:30 PM

#

for points you can "buy" steam games and gift cards

#

yeah

sand reef Jun 15, 2019, 1:30 PM

#

well, you can possible get as far with this model as far you can without a model lol

prisma verge Jun 15, 2019, 1:30 PM

#

this ai is always bouncing between whether it's worth choosing trevor or michael

sand reef Jun 15, 2019, 1:30 PM

#

*possibly

prisma verge Jun 15, 2019, 1:31 PM

#

but always hates on franklin

#

:p

sand reef Jun 15, 2019, 1:31 PM

#

maybe Franklin doesnt get in the biggest team a lot

prisma verge Jun 15, 2019, 1:31 PM

#

nah, the least one with biggest teams is Michael

#

then goes trevor, then franklin

sand reef Jun 15, 2019, 1:32 PM

#

xD, so, then i guess we are better off without a model xD

#

just make a random number generator

prisma verge Jun 15, 2019, 1:32 PM

#

nah, it's your task to go on the team with least numbers or at least middle team

#

biggest team = arrest

#

i just thought that with all prediction stuff it's actually possible to do some predicts that have more than 20% chance of being true

sand reef Jun 15, 2019, 1:33 PM

#

nah boi

#

for that just use maths

#

you need a lot of data and features to do predictive modeling

prisma verge Jun 15, 2019, 1:34 PM

#

what maths should i do

#

🤔

sand reef Jun 15, 2019, 1:34 PM

#

probability

#

welp, i m off now!

#

ciao~

prisma verge Jun 15, 2019, 1:35 PM

#

good luck, and that was fun :p

spark nimbus Jun 15, 2019, 3:12 PM

#

@lapis sequoia moving here; Do you know of an efficient way to handle EQ?

lapis sequoia Jun 15, 2019, 3:15 PM

#

@spark nimbus what do you exactly mean? Like frequency bands?

spark nimbus Jun 15, 2019, 3:15 PM

#

yup

#

since this is the most taxing part of my current implementation

#

```python
    def bands_to_eq_size(self, frame: numpy.ndarray) -> numpy.ndarray:
        frame *= self.eq
        return frame / 1000

    def transform(self, frame: numpy.ndarray) -> numpy.ndarray:
        fftified = self.fft(frame.copy())
        eq_applied = self.bands_to_eq_size(fftified)
        return self.ifft(eq_applied)

    def process(self, audio: AudioSequence) -> AudioSequence:
        left, right = audio / 2

        new_left = []
        new_right = []

        for old, new in zip([left, right], [new_left, new_right]):
            for frame in old:
                new.append(frame.apply(self.transform, seq=True))

        return (
            left.new(
                numpy.concatenate(
                    [f.audio
                     for f in new_left])) *
            right.new(
                numpy.concatenate(
                    [f.audio
                     for f in new_right])))
```
here's my current implementation

lapis sequoia Jun 15, 2019, 3:17 PM

#

Well I will work on that this evening. But I can't promise anything. Asking for an efficient way is quite relative.

#

Hmmm srry i dont have a pc right now. I am at work hahah

spark nimbus Jun 15, 2019, 3:17 PM

#

right now it's done frame-by-frame
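
One common way to cut per-frame overhead is to stack the frames into one 2D array and let NumPy run the FFT over all of them at once. A sketch under assumptions: all frames have the same length, `eq` is a hypothetical array of per-bin gains, and the `/ 1000` scaling and the `AudioSequence` class from the snippet above are left out because their purpose is not shown.

```python
import numpy as np

def apply_eq(frames: np.ndarray, eq: np.ndarray) -> np.ndarray:
    """Apply per-bin gains to every frame with one batched FFT.

    frames: (n_frames, frame_len) real samples
    eq:     (frame_len // 2 + 1,) gain per rfft bin (hypothetical layout)
    """
    spectrum = np.fft.rfft(frames, axis=1)    # FFT of all frames at once
    spectrum *= eq                            # broadcast the gains over frames
    return np.fft.irfft(spectrum, n=frames.shape[1], axis=1)

# Made-up input: 100 frames of 1024 samples, flat EQ (all gains 1.0).
frames = np.random.default_rng(0).normal(size=(100, 1024))
flat_eq = np.ones(1024 // 2 + 1)
out = apply_eq(frames, flat_eq)
```

With a flat EQ the output matches the input up to floating-point error, which makes a handy sanity check before plugging in real gains.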

lapis sequoia Jun 15, 2019, 3:18 PM

#

Hmmm well

#

My way is to divide the frequency spectrum for human hearing into bands

#

So say 0 hz to 22 khz can be static or a logarithmic function

spark nimbus Jun 15, 2019, 3:19 PM

#

wdym by that?

lapis sequoia Jun 15, 2019, 3:19 PM

#

Then on each band you can do fft
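
A sketch of the band idea with assumed numbers: band edges spaced logarithmically between 20 Hz and 22 kHz (starting at 20 Hz rather than 0, because a logarithmic scale cannot start at zero), a 44.1 kHz sample rate, and one gain per band expanded to one gain per FFT bin. All names here are hypothetical.

```python
import numpy as np

sample_rate = 44_100    # assumed
frame_len = 1024        # assumed
n_bands = 10            # assumed

# Logarithmically spaced band edges: n_bands + 1 edges make n_bands bands.
edges = np.geomspace(20.0, 22_000.0, n_bands + 1)

# Frequency of each rfft bin for this frame length.
bin_freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

# Band index of each bin; bins outside the range are clipped to the end bands.
band_of_bin = np.clip(np.searchsorted(edges, bin_freqs, side="right") - 1,
                      0, n_bands - 1)

band_gains = np.ones(n_bands)
band_gains[0] = 0.5                    # e.g. halve the lowest band
bin_gains = band_gains[band_of_bin]    # one gain per FFT bin
```

The resulting `bin_gains` array has one entry per rfft bin, so it can be multiplied directly onto a frame's spectrum.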

spark nimbus Jun 15, 2019, 3:19 PM

#

hmm

lapis sequoia Jun 15, 2019, 3:20 PM

#

Well the thing is

spark nimbus Jun 15, 2019, 3:20 PM

#

sadly due to lack of knowledge in this area I have no clue what all that means

#

and there's no easy way to get into audio it seems

lapis sequoia Jun 15, 2019, 3:20 PM

#

Welll uhm

#

Yeah my knowledge is also very low. It costs me a lot of time. But you first have to know what you want to achieve

#

Is it plotting? What do you need exactly? Because it can be peaks? Of a specific band plotted? Or you can have averages of bands

#

You really must tell exactly what you want because that will

spark nimbus Jun 15, 2019, 3:23 PM

#

I'm manipulating live audio

#

basically a DSP

lapis sequoia Jun 15, 2019, 3:23 PM

#

Yeah.....

#

Manipulating?

spark nimbus Jun 15, 2019, 3:23 PM

#

basically applying FX

lapis sequoia Jun 15, 2019, 3:23 PM

#

Ahaaaa

#

And you are creating a new audio feed? Or file?

spark nimbus Jun 15, 2019, 3:24 PM

#

passing straight to pyaudio

lapis sequoia Jun 15, 2019, 3:25 PM

#

So you have a wave file then you want to apply fx and send that to pyaudio?

#

Fx is quite complicated bro. You really need to understand basic fft and complex numbers to make an fx on it. To be honest i dont have a clue on how to do that. I know i can get a plot for freq peaks and i know how to calculate averages on freq bands but applying fx is quite complicated

prisma verge Jun 15, 2019, 3:36 PM

#

@sand reef hey! what do you think, what will happen if you pass less data to the nn?

#

like, only last ten entries?

#

will the prediction become closer to "The Truth" (TM) because last entries are easier to judge from?

#

accuracy is 60% on latest entries btw

prisma verge Jun 15, 2019, 5:13 PM

#

@sand reef so, NN doesn't work because it never gave me a result like this lol

📎 unknown.png

sand reef Jun 15, 2019, 5:27 PM

#

Yup

#

And no, fewer entries means the output will be even more random

modest scarab Jun 15, 2019, 5:58 PM

#

hi there!

#

Hows everyone doing today?

#

I am part of the chamber of commerce committee for my local Indian community in my city

#

i am thinking about creating a census from the indian community so i can understand the demographic and the analysis of the demographic

#

so my committee can understand how to better serve the indian community

#

it would be easy to send a newsletter to everyone with a google form and to work with other indian leaders in my community to make a concerted effort to gather information about what part of india they are from, their type of profession, their families and households, their age group, and the area they live in. of course, this is not about asking for private information, just general information

#

do you have any suggestions or ideas of how I could improve this project idea?

sand reef Jun 15, 2019, 7:06 PM

#

How do you want to improve it? Like what are your limitations? And what kind of improvement are you looking for?

prisma verge Jun 15, 2019, 7:10 PM

#

it actually guessed the stuff right!

#

once

#

now i've reran it to see if it'll gonna guess it again

#

because i've added dropouts and third network to it

#

@sand reef imagine all of money we'll win if my way works!

sand reef Jun 15, 2019, 7:12 PM

#

Lol

prisma verge Jun 15, 2019, 7:13 PM

#

"your car doesn't work? just add another engine to it" (C)

sand reef Jun 15, 2019, 7:13 PM

#

You added a 3rd network like how I added the second one?

prisma verge Jun 15, 2019, 7:13 PM

#

yeah

sand reef Jun 15, 2019, 7:13 PM

#

🤣

#

Well, that's going to be a big network now

#

And congratulations!

prisma verge Jun 15, 2019, 7:13 PM

#

GOOGLE, LOOK, WE CAN PREDICT THE FUTURE NOW

#

FUND OUR STARTUP PLEASE

sand reef Jun 15, 2019, 7:14 PM

#

Well, how I wish they would.

prisma verge Jun 15, 2019, 7:14 PM

#

though this predict may have been an accident

sand reef Jun 15, 2019, 7:14 PM

#

Probably

prisma verge Jun 15, 2019, 7:14 PM

#

network is still learning to see if it wasn't

sand reef Jun 15, 2019, 7:15 PM

#

Ppl would say calculate a p value for it

prisma verge Jun 15, 2019, 7:15 PM

#

do i look like a mathematician?

sand reef Jun 15, 2019, 7:15 PM

#

But we have only 1 feature, wtf we gonna calculate?

prisma verge Jun 15, 2019, 7:16 PM

#

we'd have more if those keras guys implemented multiple regression thing

#

and we could win thousand of 5 euro gift cards...

sand reef Jun 15, 2019, 7:16 PM

#

We are making features out of one feature

#

Well. Imma go read the pdf I was reading.

#

You also read this.

prisma verge Jun 15, 2019, 7:17 PM

#

MAN

#

IT PREDICTED IT RIGHT

#

though it may have been reverse order because i don't remember what order it was

sand reef Jun 15, 2019, 7:18 PM

#

📎 ISLR_Seventh_Printing.pdf

prisma verge Jun 15, 2019, 7:18 PM

#

BUT STILL

#

IT MAKES SENSE

sand reef Jun 15, 2019, 7:18 PM

#

Read that and become a mathematician

prisma verge Jun 15, 2019, 7:19 PM

#

IT MADE "michael - trevor - franklin"
AND MICHAEL WAS BIGGEST TEAM, TREVOR WAS MIDDLE, FRANKLIN WAS SMALLEST

#

DOES IT ACTUALLY WORK???

sand reef Jun 15, 2019, 7:19 PM

#

Yep. That was reverse

prisma verge Jun 15, 2019, 7:19 PM

#

i've reran it again

sand reef Jun 15, 2019, 7:19 PM

#

So. Nope. I think that was luck.

prisma verge Jun 15, 2019, 7:20 PM

#

i mean third time cannot be luck

sand reef Jun 15, 2019, 7:20 PM

#

In reverse?

prisma verge Jun 15, 2019, 7:20 PM

#

by in reverse i mean it could have outputted franklin trevor michael but i don't remember since my memory is a goldfish's memory

#

i'm rerunning it

sand reef Jun 15, 2019, 7:20 PM

#

Well. If you do get 1000s worth of gift cards, send a couple thousand here

prisma verge Jun 15, 2019, 7:21 PM

#

and if it outputs good result third time

sand reef Jun 15, 2019, 7:21 PM

#

😂

prisma verge Jun 15, 2019, 7:21 PM

#

i'll donate euro 2000 to you, get euro 2000 for me, and donate euro 1000 to somebody

#

:p

sand reef Jun 15, 2019, 7:21 PM

#

Sure!

prisma verge Jun 15, 2019, 7:22 PM

#

that's if our magic ai actually works

sand reef Jun 15, 2019, 7:22 PM

#

Well. I gtg now. Time to sleep. It's kinda getting late now.

#

Yeah. I actually hope it works.

prisma verge Jun 15, 2019, 7:22 PM

#

well, 10 pm here, not that much

#

good night!

sand reef Jun 15, 2019, 7:22 PM

#

11.22 PM here

#

Good knight!

prisma verge Jun 15, 2019, 7:25 PM

#

... it made it again

#

michael trevor franklin

#

predicting it again on new dataset

#

let's see if it'll tell the truth

#

... okay now it always outputs michael trevor franklin

#

even with new data

#

weird

#

it now outputs same results over and over

#

2.283496 2.372942 2.0555158
it says now that trevor will be least, franklin biggest, michael middle
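
Reading those three numbers with the `correct` function from earlier, assuming the column order Michael, Franklin, Trevor used there: `np.argsort` returns indices from the smallest value to the largest, so the first name is the predicted smallest team.

```python
import numpy as np

names = ['Michael', 'Franklin', 'Trevor']    # order assumed from the earlier function
prediction = np.array([[2.283496, 2.372942, 2.0555158]])

order = np.argsort(prediction)[0]            # indices, smallest value first
ranked = [names[i] for i in order]

print(ranked)   # ['Trevor', 'Michael', 'Franklin']
```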

#

xen0remind
(putting this to search for it later when heists end)

jagged stump Jun 15, 2019, 8:31 PM

#

I will share something about data science, ML and DL. Maybe you guys know it already, but I guess that cheat sheet will be helpful for you

#

ML Structure

📎 AI_Structure.jpeg

#

Deep Learning (Coursera Notes)

📎 DeepLearning.pdf

sand reef Jun 16, 2019, 12:05 AM

#

Thanks a lot @jagged stump !

small ore Jun 16, 2019, 12:05 AM

#

450+ msgs since yesterday. This channel is suddenly rocking

silent swan Jun 16, 2019, 2:43 AM

#

the summer of "how do I get started on ml/dl?" 😄

lean ledge Jun 16, 2019, 3:07 AM

#

"I'm really interested in AI and ML" - literally every first and second year computer science student ever

noble ledge Jun 16, 2019, 7:11 AM

#

🚶

sand reef Jun 16, 2019, 8:30 AM

#

But 3rd year?

#

And what do I learn to get into Computer Vision? I am seeing a lot of stuff and feeling overwhelmed and don't know where to begin from. I know how to use Regular conv NNs.

lean ledge Jun 16, 2019, 8:36 AM

#

I thought I already told you what is expected out of a job

#

📎 Screenshot_20190616-183649_Discord.jpg

#

@sand reef

#

The list is completely different if you want to go for traditional computer vision over deep vision

jagged stump Jun 16, 2019, 10:14 AM

#

If you need something else about documents I have so many

#

I can share guys

#

AI Basics

📎 ai-basics.png

#

ML

📎 ML.pdf

#

Here so many cheatsheet

📎 CheatSheets.pdf

sand reef Jun 16, 2019, 1:18 PM

#

Holy...

#

Well. I guess I'll do something about it then

#

So, basically, ML and DL isn't as hyped up as it appears to be? @lean ledge

earnest prawn Jun 16, 2019, 3:21 PM

#

it is as hyped up as it appears to be in the public

#

just not the industry or at least just in the minority of industry

jagged stump Jun 16, 2019, 3:59 PM

#

I have a question. What is the best way to do real-time brand detection? Any ideas?

#

CNN etc., which one would you advise?

onyx granite Jun 16, 2019, 5:04 PM

#

a large majority of major models used by large institutions are still more "traditional" machine learning, partly because they work very well still and partly because DL/ML is harder to "explain" to regulators than a regression type model

#

you would have to get into more details which would be cumbersome for explaining to less tech-savvy people

sand reef Jun 16, 2019, 7:44 PM

#

I have a question from ISLR.

#

This is from linear regression

#

📎 ISLR-3.PNG

#

These are the parameters we have to minimize, and we calculate them using this formula (above)

#

📎 ISLR-2.PNG

#

This is given formula for calculating the variance of the means. (above)

#

📎 ISLR-1.PNG

#

How are we able to derive those? (above)

#

What I know:

#

2. Standard error is measured for the "deviation" between the "means" taken from random samples and the true mean from a given dataset.

#

So, what am I missing here? I can't seem to derive this weird formula. Nor am I able to logically deduce this either.

#

ISLR First Printing (Page 66)

#

Link to pdf: https://www-bcf.usc.edu/~gareth/ISL/ISLR First Printing.pdf

prisma verge Jun 16, 2019, 8:46 PM

#

@sand reef well it kinda predicted it but trevor was middle not michael

#

franklin was biggest still though

sand reef Jun 16, 2019, 11:39 PM

#

Nice!

silent swan Jun 17, 2019, 1:10 AM

#

@sand reef I'm not sure which part you're having issues with

lapis sequoia Jun 17, 2019, 4:21 AM

#

I need some help

#

there's three classes in my dataframe, how do I limit them to 100K rows each
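
One way this could be done in pandas, as a minimal sketch: `groupby(...).head(n)` keeps the first `n` rows of every group. The column name `label` and the tiny frame below are made up for illustration; for the real data the limit would be `100_000`.

```python
import pandas as pd

def cap_rows_per_class(df, label_col, max_rows):
    """Keep at most max_rows rows for each distinct value in label_col."""
    # groupby(...).head(n) returns the first n rows of every group,
    # keeping the original row order and index.
    return df.groupby(label_col).head(max_rows)

# Tiny stand-in frame; the column name "label" is hypothetical.
df = pd.DataFrame({"label": list("aaabbc"), "value": range(6)})
capped = cap_rows_per_class(df, "label", 2)
print(capped["label"].value_counts().to_dict())
```

If a random subset per class is preferred over the first rows, shuffling first with `df.sample(frac=1)` before the groupby is one option.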

sand reef Jun 17, 2019, 4:31 AM

#

@silent swan how the standard error formulas are being derived for the square of the parameters.

silent swan Jun 17, 2019, 6:18 AM

#

ah, I would say for those you'll actually need to get into real statistics

#

like a standard regression textbook or regression chapter in a statistics textbook will derive them
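
For reference, a sketch of the usual textbook derivation. This assumes the screenshots show ISLR's standard-error formulas for the simple linear regression coefficients (equation 3.8 in the book), and it assumes the x values are treated as fixed and the errors are uncorrelated with a common variance σ². The shorthand S_xx is introduced here and is not from the book.

```latex
% Model: y_i = \beta_0 + \beta_1 x_i + \epsilon_i, Var(\epsilon_i) = \sigma^2, errors uncorrelated.
% Shorthand (assumption of this sketch): S_{xx} = \sum_i (x_i - \bar{x})^2.
\begin{align*}
\hat\beta_1 &= \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{S_{xx}}
             = \sum_i \frac{x_i - \bar x}{S_{xx}}\, y_i
             && \text{since } \sum_i (x_i - \bar x) = 0 \\
\operatorname{Var}(\hat\beta_1) &= \sum_i \frac{(x_i - \bar x)^2}{S_{xx}^2}\,\sigma^2
             = \frac{\sigma^2}{S_{xx}} \\
\hat\beta_0 &= \bar y - \hat\beta_1 \bar x \\
\operatorname{Var}(\hat\beta_0) &= \operatorname{Var}(\bar y)
             + \bar x^2 \operatorname{Var}(\hat\beta_1)
             - 2 \bar x \operatorname{Cov}(\bar y, \hat\beta_1)
             = \sigma^2 \left[ \frac{1}{n} + \frac{\bar x^2}{S_{xx}} \right]
\end{align*}
```

The last step uses Var(ȳ) = σ²/n and Cov(ȳ, β̂₁) = (σ²/n) Σ(xᵢ − x̄)/S_xx = 0. The squared standard errors are these variances; in practice σ² is replaced by its estimate from the residuals.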

lost sinew Jun 17, 2019, 7:38 AM

#

import pandas as pd

markets = ['ETH', 'BTC', 'ADA', 'ETC', 'EOS', 'NEO', 'BCHABC']
pair = 'USDT'
dfs = []
start_time = '2018-01-01'
end_time = '2019-06-15'

for market in markets:
    df = pd.read_csv(market + pair + '.csv', parse_dates=True, index_col=0)

    df = df.drop(columns=['Open', 'High', 'Low', 'Volume'])
    df.rename(columns={'Close': market + pair + ' Close'}, inplace=True)
    df = df.loc[start_time: end_time]
    dfs.append(df)

#

there is like 7 columns.. each with the closing price of each coin.. is there a way i could add all of the prices and average it and make it a separate column called 'average'

sand reef Jun 17, 2019, 8:49 AM

#

there is

#

if i am not wrong, there should either be a mean function in pandas itself, if not, i think numpy will work as well

#

cuz, again, if i m not wrong, pandas was built on numpy
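
A minimal sketch of that idea, assuming the per-market frames in `dfs` share the same date index: `pd.concat(..., axis=1)` puts them side by side and `DataFrame.mean(axis=1)` averages across the columns. The prices below are made up.

```python
import pandas as pd

# Stand-in for the `dfs` list built in the loop above: one single-column
# frame per market, indexed by date. The prices here are made up.
idx = pd.date_range("2019-06-01", periods=3, freq="D")
dfs = [
    pd.DataFrame({"ETHUSDT Close": [250.0, 260.0, 270.0]}, index=idx),
    pd.DataFrame({"BTCUSDT Close": [8000.0, 8100.0, 8200.0]}, index=idx),
]

# Line the frames up side by side on their shared date index.
combined = pd.concat(dfs, axis=1)

# axis=1 averages across columns, i.e. one mean per row (date).
combined["average"] = combined.mean(axis=1)
print(combined)
```

Note that a plain average of raw prices is dominated by the most expensive coin, so normalizing each column first may be worth considering.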

wide gyro Jun 17, 2019, 1:06 PM

#

so i have a function like this

#

def epochUpd(row):
    row.at['updated'] = time.strftime("%a, %b %d %Y", time.localtime(row.at['updated']))

#

and i have another function called clean which does something like this

#

def clean(df)
    for index,row in df.iterrows():
        epochUpd(row)
    df.to_csv('updated.csv',index=False)

#

Basically, I am trying to clean a data file, but I want to change every 'updated' column to show the value I created from the epochUpd(row)

#

Firstly, I don't think the iterrows() is the right way to go because I feel like it'll take a while and there could be a much easier way

#

Also, I'm not sure if the formatting is even near correct

#

Could I change the epoch function to

#

def epochUpd(df)
    df.updated = time.strftime("%a, %b %d %Y", time.localtime(df.updated))

#

and then keep the for loop how it is in the clean function

#

nevermind, getting invalid syntax on that last function

#

but there's gotta be a way
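
One vectorized way to do this, as a sketch rather than a drop-in fix: convert the whole column with `pd.to_datetime(..., unit="s")` and format it with `.dt.strftime`. It assumes `updated` holds epoch seconds; the two-row frame is made up. A few likely causes of the trouble above: the invalid syntax probably comes from the missing colon after `def epochUpd(df)`, `time.localtime` takes a single number rather than a whole column, and rows yielded by `iterrows()` are typically copies, so writing to them does not change the frame.

```python
import pandas as pd

def clean(df):
    """Return a copy of df with 'updated' turned from epoch seconds into text."""
    out = df.copy()
    # pd.to_datetime(unit="s") parses the whole column at once (as UTC-based
    # timestamps), and .dt.strftime formats every value without a Python loop.
    out["updated"] = pd.to_datetime(out["updated"], unit="s").dt.strftime("%a, %b %d %Y")
    return out

df = pd.DataFrame({"updated": [0, 86400]})
cleaned = clean(df)
print(cleaned)
```

Unlike `time.localtime`, this interprets the values as UTC, so dates near midnight can differ from the local-time version. Writing the result out is then `cleaned.to_csv('updated.csv', index=False)`.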

lapis sequoia Jun 17, 2019, 4:08 PM

#

may i know whats wrong with this code

import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from keras import Sequential
from keras.layers import Dense, Flatten

x, y = load_digits().data, load_digits().target
x_train, x_test, y_train, y_test = train_test_split(x, y, shuffle=False)
model = Sequential()
model.add(Dense(64, activation=tf.keras.activations.relu))
model.add(Dense(64, activation=tf.keras.activations.relu))
model.add(Dense(10, activation="softmax"))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=3, batch_size=10)
print(model.predict(x_test[0]))

#

  File "/Users/Kushi/PycharmProjects/test/misc/ML_test.py", line 19, in <module>
    model.fit(x_train, y_train, epochs=3, batch_size=10)
  File "/Users/Kushi/PycharmProjects/test/venv/lib/python3.7/site-packages/keras/engine/training.py", line 952, in fit
    batch_size=batch_size)
  File "/Users/Kushi/PycharmProjects/test/venv/lib/python3.7/site-packages/keras/engine/training.py", line 789, in _standardize_user_data
    exception_prefix='target')
  File "/Users/Kushi/PycharmProjects/test/venv/lib/python3.7/site-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected dense_3 to have shape (10,) but got array with shape (1,)

#

but in the load_digits dataset there are 10 class names so shouldnt i have dense 3 ten so that it gives me the probability of each one?

#

please ping me when helping

prisma verge Jun 17, 2019, 4:34 PM

#

@sand reef
['Trevor', 'Franklin', 'Micheal']
[[2.3735387 2.3416874 1.861764 ]]
let's see if it works :p

#

if it'll guess biggest team third time

#

that'd be cool

brazen wing Jun 17, 2019, 5:04 PM

#

@lapis sequoia you didn't give it that though, you gave it a single int

prisma verge Jun 17, 2019, 5:18 PM

#

HUH

#

IT DID GUESS THAT MICHAEL WILL BE BIGGEST

#

still weird with middle and smallest team

#

it mixes them up so trevor was middle

#

but yeah it predicts biggest team so that's good already

#

['Franklin', 'Micheal', 'Trevor']
[[2.1652157 1.9663016 2.2141366]]

#

let's see if next game will be right too

#

if yeah...

#

then that's a gem

daring spindle Jun 17, 2019, 6:09 PM

#

Hey guys what is the best place to learn ML and Neural Networks?

olive willow Jun 17, 2019, 6:14 PM

#

dude first make a mind map what you need to start learning it because you need for example programming fundamentals, linear algebra, stats and later also calc

#

to properly understand it

prisma verge Jun 17, 2019, 6:15 PM

#

cough no you don't cough

olive willow Jun 17, 2019, 6:15 PM

#

assuming you know more complex algebra and functions and how to use them

#

?

#

to really understand it yes you do

prisma verge Jun 17, 2019, 6:15 PM

#

for building networks just for fun you can use keras

#

also there's the amazing Zero to Deep Learning course which covers all the stuff you need, including math explanations

olive willow Jun 17, 2019, 6:16 PM

#

yes but building and understanding are two different places

prisma verge Jun 17, 2019, 6:16 PM

#

https://www.zerotodeeplearning.com/

Zero To Deep Learning

Fullstack Deep Learning tutorials to go from zero to production covering all basics and advanced concepts.

daring spindle Jun 17, 2019, 6:16 PM

#

Math was my best finals subject I understand it ez

prisma verge Jun 17, 2019, 6:17 PM

#

but can you do linear algebra?

olive willow Jun 17, 2019, 6:17 PM

#

he's 13

prisma verge Jun 17, 2019, 6:17 PM

#

why do you think so tho

olive willow Jun 17, 2019, 6:17 PM

#

btw so you haven't had that in school

#

I know him

prisma verge Jun 17, 2019, 6:17 PM

#

... oh

daring spindle Jun 17, 2019, 6:17 PM

#

Erm I am 14 now sur

olive willow Jun 17, 2019, 6:17 PM

#

happy birthday then!

daring spindle Jun 17, 2019, 6:17 PM

#

The real man heroh

#

almost got my beard ready

prisma verge Jun 17, 2019, 6:18 PM

#

really though check out the ZTDL course

#

it covers all subjects you'll need

olive willow Jun 17, 2019, 6:18 PM

#

I'm using datacamp

prisma verge Jun 17, 2019, 6:18 PM

#

with source codes, jupyter notebooks, and stuff

olive willow Jun 17, 2019, 6:18 PM

#

but I'm going into DS

#

and robotics

prisma verge Jun 17, 2019, 6:18 PM

#

i'm planning to go web infosec so
i'm just doing nns for fun

olive willow Jun 17, 2019, 6:19 PM

#

btw khan academy is good for math

prisma verge Jun 17, 2019, 6:19 PM

#

like text generation, prediction, stuff

olive willow Jun 17, 2019, 6:19 PM

#

oohh sure

prisma verge Jun 17, 2019, 6:19 PM

#

and it doesn't require much math, just basic array understanding

olive willow Jun 17, 2019, 6:19 PM

#

thats ez

prisma verge Jun 17, 2019, 6:19 PM

#

and python and numpy syntax knowledge

olive willow Jun 17, 2019, 6:19 PM

#

I actually need to know a lot to do good DS

#

SQL, numpy, pandas, ML, math etc.

#

python

#

a lot of visualizing tools

prisma verge Jun 17, 2019, 6:20 PM

#

yes but that's if you go for DS as a job

#

i'm building projects just to laugh them off with friends :p

olive willow Jun 17, 2019, 6:20 PM

#

oohh hahahahahaha

#

I want to have an own robotics company which is using AI to make them human like

prisma verge Jun 17, 2019, 6:21 PM

#

i've made a thing that'd make poem-like neuromancer quotes

olive willow Jun 17, 2019, 6:21 PM

#

oooohhh that cool I guess

prisma verge Jun 17, 2019, 6:21 PM

#

i also want to learn markov chains but my brain refuses to understand their realizations in python

olive willow Jun 17, 2019, 6:21 PM

#

whats that?

earnest prawn Jun 17, 2019, 6:22 PM

#

a stochastic model

#

for state transitions

olive willow Jun 17, 2019, 6:22 PM

#

nix out of nowhere

prisma verge Jun 17, 2019, 6:22 PM

#

oh no by mentioning scientific words you summon nix

olive willow Jun 17, 2019, 6:22 PM

#

sure

earnest prawn Jun 17, 2019, 6:22 PM

#

i have been watching for around 10 minutes actually

olive willow Jun 17, 2019, 6:22 PM

#

hahahhaahha

prisma verge Jun 17, 2019, 6:22 PM

#

may we get pinned picture here with

#

BIG NIX IS WATCHING YOU

olive willow Jun 17, 2019, 6:23 PM

#

yh

#

why do you even need cameras or monitoring bots if you have NIX

earnest prawn Jun 17, 2019, 6:24 PM

#

because I dont convert to jpegs

prisma verge Jun 17, 2019, 6:24 PM

#

i kinda get markov chains basics like
there's a bunch of words and every word has a chance to follow the other word

earnest prawn Jun 17, 2019, 6:24 PM

#

no

prisma verge Jun 17, 2019, 6:24 PM

#

... but

earnest prawn Jun 17, 2019, 6:24 PM

#

that is not the deeper idea of markov chains

prisma verge Jun 17, 2019, 6:24 PM

#

where can i get a simple explanation of MC

earnest prawn Jun 17, 2019, 6:24 PM

#

the deeper idea of markov chains is that you have a set of states and then you have a transition probability from every state to every state

#

that is everything there is

#

the words are just a form of states

prisma verge Jun 17, 2019, 6:25 PM

#

oh
so MC are just counting probabilities between transitions which can be any object like

#

weather, number

#

kinda like that?

earnest prawn Jun 17, 2019, 6:25 PM

#

counting probabilities?

sand reef Jun 17, 2019, 6:25 PM

#

Say. Has anyone heard of reservoir computing? What is it about?

prisma verge Jun 17, 2019, 6:26 PM

#

compute probabilities
how do i say it

#

i can't say it with right words

#

that's not what i mean

earnest prawn Jun 17, 2019, 6:26 PM

#

the markov chain itself does not compute probabilities

#

it expects you to provide probabilities and states

prisma verge Jun 17, 2019, 6:26 PM

#

yeah i get that

earnest prawn Jun 17, 2019, 6:26 PM

#

all the libraries you see for markov chains just do this for you

prisma verge Jun 17, 2019, 6:27 PM

#

oh

#

there's libs for markov chains?

earnest prawn Jun 17, 2019, 6:27 PM

#

dozens

prisma verge Jun 17, 2019, 6:27 PM

#

may i just get fancy text generator

#

because i love random stuff

#

and markov chains examples seem to be like human words but still with a bit of computer craziness

earnest prawn Jun 17, 2019, 6:28 PM

#

i mean you can easily implement stationary markov chains yourself

#

it gets a bit ugly with mth order markov chains because you have to look m states back for your new decision

uneven wren Jun 17, 2019, 6:29 PM

#

is there a way to access printed information?

earnest prawn Jun 17, 2019, 6:29 PM

#

you read from stdout?

#

also how does this fit here?

uneven wren Jun 17, 2019, 6:29 PM

#

hmmm i thought it would be topical to data-science

prisma verge Jun 17, 2019, 6:29 PM

#

but i don't understand markov chains and i just want to get more random stuff that look humanese but weird

uneven wren Jun 17, 2019, 6:29 PM

#

sorry

earnest prawn Jun 17, 2019, 6:29 PM

#

dont worry

#

so for markov chains, suppose we have three states s1 s2 s3

the transition probabilities could for example be
0.4 for s1 -> s2
0.8 for s1 -> s3
0.2 for s3 -> s2
0.1 for s3 -> s2
0.9 s2 -> s1

all omitted ones are 0

so if you start out in s1 you check your probabilities and see you have to go to s3
s3 transitions to s2
and s2 transitions to s1 (and now we're lost in an endless loop)
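
A minimal sketch of such a chain in Python. The numbers in the message above don't quite add up to 1 per state, so the transition table here uses made-up probabilities that do; the names `transitions` and `walk` are illustrative only. The next state is drawn at random according to the probabilities rather than always taking the most likely one.

```python
import random

# Hypothetical transition table: each row's probabilities sum to 1.
transitions = {
    "s1": {"s2": 0.2, "s3": 0.8},
    "s2": {"s1": 1.0},
    "s3": {"s2": 0.9, "s1": 0.1},
}

def walk(start, steps, rng):
    """Follow the chain for `steps` transitions and return the visited states."""
    state = start
    path = [state]
    for _ in range(steps):
        options = transitions[state]
        # Pick the next state at random, weighted by its transition probability.
        state = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(state)
    return path

path = walk("s1", 10, random.Random(0))
print(" -> ".join(path))
```

Swapping the states for words and the table for probabilities estimated from a text gives the text-generator flavour discussed earlier.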

olive willow Jun 17, 2019, 6:32 PM

#

so nix as you're here, can you tell me what this is Parametric representations of lines in linear algebra because the guy in khan academy only confuses me more

desert oar Jun 17, 2019, 6:32 PM

#

can you give context for that

#

what is a parametric representation of lines

prisma verge Jun 17, 2019, 6:33 PM

#

huh

olive willow Jun 17, 2019, 6:33 PM

#

idk that's why I ask the all mighty nix who knows mostly everything

prisma verge Jun 17, 2019, 6:33 PM

#

that now seems easy

olive willow Jun 17, 2019, 6:33 PM

#

https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/vectors/v/linear-algebra-parametric-representations-of-lines?modal=1

Khan Academy

Learn for free about math, art, computer programming, economics, physics, chemistry, biology, medicine, finance, history, and more. Khan Academy is a nonprofit with the mission of providing a free, world-class education for anyone, anywhere.

earnest prawn Jun 17, 2019, 6:33 PM

#

yes it is extremely easy @prisma verge

prisma verge Jun 17, 2019, 6:33 PM

#

and yeah, nix seem to know everything

olive willow Jun 17, 2019, 6:33 PM

#

pretty much

prisma verge Jun 17, 2019, 6:33 PM

#

praise nix

desert oar Jun 17, 2019, 6:33 PM

#

that's the title of the video, i think you should watch it to find out what it is

olive willow Jun 17, 2019, 6:34 PM

#

I've watched it more than once

#

and it doesn't make sense to me

prisma verge Jun 17, 2019, 6:34 PM

#

may we get :praiseNix: emote

olive willow Jun 17, 2019, 6:34 PM

#

yh

desert oar Jun 17, 2019, 6:34 PM

#

@olive willow is there a specific part of it? or the whole thing is confusing

earnest prawn Jun 17, 2019, 6:35 PM

#

also Id argue that a parametric representation of a line is (from a geometrical point of view)
a vector lets call it a
another vector lets call it b
and a variable scalar l

and then you could represent a line which goes from a in direction b like

a + l * b

but that is my school geometry thinking I dont really know what it is, it just sounds similar

and no I do not know everything in fact id consider myself horribly bad at almost everything related to data science
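
The `a + l * b` form above can be sketched in a few lines of NumPy. The helper name `point_on_line` and the example vectors are made up for illustration; the point is that the same formula works in any number of dimensions.

```python
import numpy as np

def point_on_line(a, b, l):
    """Return the point a + l * b on the line through a with direction b."""
    return np.asarray(a, dtype=float) + l * np.asarray(b, dtype=float)

# A line in 2D: start at (1, 2), head in direction (3, -1).
a = [1.0, 2.0]
b = [3.0, -1.0]
points = np.array([point_on_line(a, b, l) for l in (-1.0, 0.0, 0.5, 2.0)])
print(points)

# The same formula, unchanged, in 4 dimensions.
p4 = point_on_line([0, 0, 0, 0], [1, 2, 3, 4], 2.0)
print(p4)
```

A line through two given points `p` and `q` fits the same shape by taking `a = p` and `b = q - p`.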

prisma verge Jun 17, 2019, 6:35 PM

#

also, nix
from what i've heard, to get probability for each word you gotta divide the amount of words on amount of times this word appeared in the text

#

right?

desert oar Jun 17, 2019, 6:35 PM

#

its saying that a vector is a "parametric" representation of a line

#

"parametric" in this case means that it uses a few specific numbers, i.e. parameters, to represent a large collection of objects, in this case a line

earnest prawn Jun 17, 2019, 6:36 PM

#

but a vector represents an infinite amount of lines

desert oar Jun 17, 2019, 6:36 PM

#

or a collection of lines

#

yeah

#

thats the point

olive willow Jun 17, 2019, 6:36 PM

#

ooohohhhh

#

so what vectors you can make with these parameters

prisma verge Jun 17, 2019, 6:37 PM

#

i don't want to hear anything about infinitiness so im getting outta here

olive willow Jun 17, 2019, 6:37 PM

#

it would be infinite mostly

prisma verge Jun 17, 2019, 6:37 PM

#

infinitiness confuse me

#

it's always confusing

#

like, that endless hotel paradox

desert oar Jun 17, 2019, 6:37 PM

#

this isn't that @prisma verge

earnest prawn Jun 17, 2019, 6:37 PM

#

i mean a vector is just a direction, it doesnt define from where it starts just where it points

#

so you can choose every point in space

prisma verge Jun 17, 2019, 6:37 PM

#

VECTOR IS JUST X AND Y CHANGE MY MIND

olive willow Jun 17, 2019, 6:37 PM

#

yh but if we would start at the origin

prisma verge Jun 17, 2019, 6:37 PM

#

derez x and derez y

olive willow Jun 17, 2019, 6:38 PM

#

to make things easy

earnest prawn Jun 17, 2019, 6:38 PM

#

then its not a vector

#

then its what i described above with a = (0|0)

desert oar Jun 17, 2019, 6:38 PM

#

@olive willow @prisma verge the punch line is at around 10:00 in that video -- you can use y = mx + b to describe a line in R2, but a vector can be used to represent an arbitrary "line" in an arbitrarily-dimensioned space

#

hence a vector is a parametric representation of a collection of lines

olive willow Jun 17, 2019, 6:38 PM

#

an arbitrary "line" in an arbitrarily-dimensioned space ?

desert oar Jun 17, 2019, 6:39 PM

#

yeah, what's a line in 10 dimensional space?

prisma verge Jun 17, 2019, 6:39 PM

#

may we just simplify that
there's a lot of dots and you can line to any of dot with x and y coordinates

#

daz vector now

desert oar Jun 17, 2019, 6:39 PM

#

equation is really complicated

#

cause you dont just have x and y

olive willow Jun 17, 2019, 6:39 PM

#

xyz

#

3dims

earnest prawn Jun 17, 2019, 6:39 PM

#

a line in 10 dimensional space is a plane in 9 dimensional space

desert oar Jun 17, 2019, 6:39 PM

#

can have more than that

earnest prawn Jun 17, 2019, 6:39 PM

#

😜

olive willow Jun 17, 2019, 6:39 PM

#

nix stop!

desert oar Jun 17, 2019, 6:39 PM

#

can have 3, 10, or even infinity

olive willow Jun 17, 2019, 6:39 PM

#

yh

#

1, 2, 3 ....... n dims

desert oar Jun 17, 2019, 6:39 PM

#

so instead of having to write special equations for each case

#

you use vectors

#

(and eventually matrices)

olive willow Jun 17, 2019, 6:40 PM

#

yh I know that

#

and tensors

prisma verge Jun 17, 2019, 6:40 PM

#

also i really respect nix and srl for knowing math because i like people that know math

olive willow Jun 17, 2019, 6:40 PM

#

in numpy

prisma verge Jun 17, 2019, 6:40 PM

#

but myself i hate math

#

but knowing it is cool i guess

olive willow Jun 17, 2019, 6:41 PM

#

A vector is a thing that has a direction and magnitude

desert oar Jun 17, 2019, 6:41 PM

#

thats one way to think of it, yes

olive willow Jun 17, 2019, 6:41 PM

#

and you can place it anywhere you want if it has the same direction and magnitude

desert oar Jun 17, 2019, 6:42 PM

#

@prisma verge can you elaborate on your question about word probabilities

olive willow Jun 17, 2019, 6:42 PM

#

but to simplify things I would see a vector as a scalar of 2 base vectors

desert oar Jun 17, 2019, 6:42 PM

#

@olive willow he gives a good example at around 12 mins in the video

#

how do you characterize the line that goes through 2 specific points

olive willow Jun 17, 2019, 6:42 PM

#

gonna watch

desert oar Jun 17, 2019, 6:42 PM

#

in a general dimensional space

#

using 2d for demo purposes

prisma verge Jun 17, 2019, 6:42 PM

#

uhhh, so, we were talking about markov chains and when i was reading about it i've read that probability of each word being added in the text = amount of words in this text divided by times this word appeared in the text

desert oar Jun 17, 2019, 6:43 PM

#

flip that around

#

and... sorta

prisma verge Jun 17, 2019, 6:43 PM

#

oh

desert oar Jun 17, 2019, 6:43 PM

#

depends. that's one specific model

#

its not always true
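
As a rough sketch of the two models being mixed up here (the sample sentence and variable names are made up): the "flipped" formula gives a plain word frequency, while a first-order Markov chain over words needs the probability of the next word given the current one.

```python
from collections import Counter, defaultdict

text = "the cat sat on the mat the cat ran"
words = text.split()

# Unigram model: P(word) = times the word appears / total number of words.
counts = Counter(words)
unigram = {w: c / len(words) for w, c in counts.items()}

# First-order Markov model: P(next | current) =
# times `next` follows `current` / times `current` is followed by anything.
followers = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    followers[current][nxt] += 1
bigram = {
    current: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
    for current, ctr in followers.items()
}

print(unigram["the"])
print(bigram["the"])
```

Sampling the next word repeatedly from `bigram` is what turns this table into a simple text generator.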

prisma verge Jun 17, 2019, 6:43 PM

#

i just want to have nice text generator, really
i love random stuff

earnest prawn Jun 17, 2019, 6:43 PM

#

then get yourself markov chain lib x and youre done

prisma verge Jun 17, 2019, 6:44 PM

#

i mean, it's not like i'm going in data science to know the theory

earnest prawn Jun 17, 2019, 6:44 PM

#

yeah well

#

then youre not really going to get far

prisma verge Jun 17, 2019, 6:45 PM

#

and, i guess simplified version of theory is good enough that was provided by you nix

#

haha

earnest prawn Jun 17, 2019, 6:45 PM

#

thats not really a simplified version

#

its literally all there is behind it

prisma verge Jun 17, 2019, 6:45 PM

#

... oh

earnest prawn Jun 17, 2019, 6:45 PM

#

i mean

#

you can express this a lot more formally

#

with random variables and whatever

#

but it boils down to this idea

prisma verge Jun 17, 2019, 6:46 PM

#

i hate formal things to be honest since i'm stupid and it's easier when you can compare stuff with real life instead of learning hundreds of new terms

desert oar Jun 17, 2019, 6:47 PM

#

idk how you ever expect to learn anything then

#

real-life examples are valuable and important

#

but if you want to get past the basic level, at some point you have to sit down and try to understand what's going on

#

this goes for many things, not just data science

prisma verge Jun 17, 2019, 6:48 PM

#

i have no idea?

#

i mean, uh, i kinda get python syntax when i'm bad at explaining this at formal side

#

and i even get what i do in code (on high level, not low level tho)

#

i guess i can say "it just works"

#

... no, not about code, about how i'm learning stuff

earnest prawn Jun 17, 2019, 6:52 PM

#

Not understanding code will hold you back enormously in the long term

prisma verge Jun 17, 2019, 6:52 PM

#

but i'm saying that i do get it

#

i'm just bad at getting things when they're explained with a lot of programming and math only terms

#

and that's weird

lapis sequoia Jun 17, 2019, 8:12 PM

#

pretty new to data science, but currently trying to analyse some sensor output. i am making pandas plots and checking if the sensor behaves "nice" according to a temperature cycle over time. are there some tools or methods in pandas which might help me discover sensor "noise" or "shifts" in data? like for example gradually changing offset over time?

median siren Jun 17, 2019, 8:35 PM

#

Hi all, i'm not sure if this is the correct place to ask, but is there anyone who's versed with python Learning To Rank packages? Especially Pairwise ranking algorithms? I'm struggling quite a bit with implementing it in python.

earnest prawn Jun 17, 2019, 10:55 PM

#

Ranking packages?

median siren Jun 17, 2019, 11:29 PM

#

Yes, for example:

pyltr: https://github.com/jma127/pyltr

GitHub

jma127/pyltr

Python learning to rank (LTR) toolkit. Contribute to jma127/pyltr development by creating an account on GitHub.

#

XGBoost LTR: https://github.com/dmlc/xgboost/tree/master/demo/rank

GitHub

dmlc/xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow - dmlc/xgboost

lapis sequoia Jun 18, 2019, 1:58 AM

#

@median siren you have to be more specific

#

what are you trying to rank

#

@lapis sequoia auto regressive methods
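
A lighter-weight first look than a full autoregressive model, as a sketch: rolling statistics in pandas. This is a different technique from the one named above, and the synthetic signal, window size, and variable names are all assumptions for illustration. A rolling mean that wanders while the true quantity is steady hints at offset drift, and the rolling standard deviation tracks the noise level.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
# Synthetic stand-in for a sensor: constant true value 20.0, white noise,
# plus an offset that grows linearly to 1.0 over the record.
signal = 20.0 + rng.normal(0.0, 0.1, n) + np.linspace(0.0, 1.0, n)
s = pd.Series(signal)

window = 100
rolling_mean = s.rolling(window).mean()  # slow changes: drift / offset
rolling_std = s.rolling(window).std()    # fast changes: noise level

# Difference between the last and the first complete window mean.
drift = rolling_mean.iloc[-1] - rolling_mean.iloc[window - 1]
print(drift, rolling_std.iloc[-1])
```

With real data, plotting `rolling_mean` and `rolling_std` next to the temperature cycle is usually more informative than a single number.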

median siren Jun 18, 2019, 2:44 AM

#

@lapis sequoia Apologies, i'll try to be more specific.

For example: consider the following table:

| Name       | Rank | Points | Embedding |
|------------|------|--------|-----------|
| Man. City  | 1    | 98     | [....]    |
| Liverpool  | 2    | 97     | [...]     |
| Tot. Hots. | 3    | 93     | [...]     |

Here, we try to predict the rank of a team, according to their embedding. So there is 1 query, namely the results of this season, and you want to predict the rank of a team based on embeddings. (Note: the embeddings can also be replaced with other features)

lapis sequoia Jun 18, 2019, 2:45 AM

#

what does the embedding represent..

#

what is the query

median siren Jun 18, 2019, 2:53 AM

#

Well, to be frank I'm quite doubtful about that. Considering the goal is to "predict a rank", I assume the query is the rank, and you want to have the most accurate name with that query.

So in this case, if the query would be "1" You'd want Man. City to return. This considering the opposite doesn't hold true, I assume?. For example: When you query "Man. City." the result can't be "1", no?

#

what does the embedding represent..

It's a graph embedding, so the embedding represent the name in a graph.

lapis sequoia Jun 18, 2019, 3:02 AM

#

I'm not able to understand the case here.. it's very cryptic..

#

you want to predict the rank of football clubs.. but what are the features exactly?

#

it's very rare for features other than text to be represented as embeddings..

lean ledge Jun 18, 2019, 3:55 AM

#

That's not true. A lot of things use embeddings. It's less common to call them embeddings outside NLP but the idea of taking a representation, finding patterns in it and embedding it into a lower dimensional latent space that's more fundamental is a very common thread throughout ML

lapis sequoia Jun 18, 2019, 4:10 AM

#

can't refer to all vector representations as embeddings..because they are specific task-restricted representations..

lapis sequoia Jun 18, 2019, 5:16 AM

#

@lapis sequoia thanks, Ill look into that

desert oar Jun 18, 2019, 5:21 AM

#

@lapis sequoia tbh python has bad tools for time series analysis. R has much better stuff for decomposition, trends, etc

#

@lapis sequoia isn't that pretty much the definition of an embedding? an invertible mapping into a lower-dimensional space?

#

theres nothing specific to text in that

lapis sequoia Jun 18, 2019, 5:24 AM

#

I don't know.. it feels very odd to me to use that term for anything other than text o.o

lean ledge Jun 18, 2019, 6:02 AM

#

What makes text embeddings different from non text ones?

lapis sequoia Jun 18, 2019, 6:34 AM

#

lets see.. in text embeddings there's cases like context free and context aware..

#

in other vector representations.. the features are represented in a certain space and can be reduced to another, just like embeddings that represent text..

#

features for a vector representation are often chosen for predictability.. taking into things like avoiding multicollinearity.. this is the opposite of context aware representations in text, which are representations that take into account their surrounding text..

#

the idea of embeddings is that the representation can be used for multiple related nlu/nlg tasks or in a related domain..

#

apparently they also use a term called Vector Embeddings to denote outputs from DL layers..

#

https://www.aclweb.org/anthology/W17-7302

formal badger Jun 18, 2019, 7:35 AM

#

How does everyone have their python environments setup?

#

Good idea to just install anaconda?

lapis sequoia Jun 18, 2019, 7:37 AM

#

pika pi?

#

I used to use anaconda for university.. then was doing everything on the cloud.. so.. make a script for launching a quick instance with everything I need

#

pikachu

elder vessel Jun 18, 2019, 7:47 AM

#

I am using miniconda with conda-forge as source. Then I conda install or pip install what is needed and set this env for the project in PyCharm (or whatever IDE) settings. If I use a lot of packages, I am uploading a venv.yml export along with the git repo.

polar acorn Jun 18, 2019, 7:50 AM

#

I just use anaconda works fine for me. I use conda environments as well.

lean ledge Jun 18, 2019, 8:13 AM

#

@lapis sequoia Vector embeddings are just embeddings. They are represented as a vector, same as text embeddings. Also not just used for deep learning. Deep learning is just very often used for things that lower dimensionality. Autoencoders' entire point is to create embeddings and classification tasks naturally create embeddings. Also a lot of embeddings use some sort of contextual information. Context around in text is just another part of what makes the information's position. In images, it would be the larger context around some area. It's not that different, really.

lapis sequoia Jun 18, 2019, 10:20 AM

#

@lapis sequoia doing a lag plot of a large dataset (some 600000ish rows) certainly took its time.. maybe I am doing this wrong. Even tried to slice out every other row to "half" the data but that still took ages.

median siren Jun 18, 2019, 10:33 AM

#

@lapis sequoia, exactly as @lean ledge says, in my specific case I use a graph that I embed. A graph embedding can be seen as similar to a text embedding. The context of a word refers to the probability of two words occurring next to each other, or near each other, in a sentence. In a graph, the context of a node refers to the probability of a node being a neighbour of another node.

Then, after that, I want to predict the ranking of those nodes based on their graph features, which in my case is represented in embeddings.

silent swan Jun 18, 2019, 11:43 AM

#

I'm still not sure I'm comfortable with people now using "embedding" to refer to any kind of hidden state

lean ledge Jun 18, 2019, 11:53 AM

#

What do you mean by hidden state?

#

Is a vector in a latent space considered hidden?

silent swan Jun 18, 2019, 12:15 PM

#

I'm referring to just any intermediate state in a neural network model

#

people have started referring to those as embeddings too

void anvil Jun 18, 2019, 12:40 PM

#

@desert oar that's entirely false lol

desert oar Jun 18, 2019, 12:40 PM

#

@void anvil ?

void anvil Jun 18, 2019, 12:40 PM

#

There are just as many, if not more, tools for TS in python and they're a lot faster

desert oar Jun 18, 2019, 12:40 PM

#

Do tell

#

Seriously it would be great to not have to call out to rpy2 for stuff like time series decomposition and arima

#

Statsmodels doesnt count

void anvil Jun 18, 2019, 12:44 PM

#

https://github.com/tgsmith61591/pmdarima

GitHub

tgsmith61591/pmdarima

A package that brings R's beloved auto.arima to Python, making an even stronger case for why Python > R for data science. - tgsmith61591/pmdarima

lean ledge Jun 18, 2019, 1:07 PM

#

@silent swan What's wrong with use of the term embedding for that? It's generally only the last layer or two that are called embeddings, not any random hidden layer. It creates feature vectors that are separable according to a quality that we care about (which we used to classify) so it's honestly very much an embedding.

#

It's not purposefully made to be any kind of encoder but it certainly works that way given how neural networks seem to work experimentally (I'm not sure of any theoretical papers that exist on how semantically focused information comes out in later layers when training for a classification task etc. Would be nice to read if anyone's got one)

lapis sequoia Jun 18, 2019, 1:56 PM

#

I'm looking for some help with numpy's bincount function

#

and trying to use bincount to make an azimuthal average but to ignore radii where data is completely missing but still have the missing value show up at that radius in the average
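
One possible approach, sketched under assumptions: missing data is stored as NaN, radii are binned to integers around the array centre, and the function name `azimuthal_average` is made up. Only valid pixels go into `np.bincount`, and `minlength` keeps a slot for radii with no data, which are then reported as NaN.

```python
import numpy as np

def azimuthal_average(image):
    """Mean of `image` at each integer radius from the centre; NaN where no data."""
    ny, nx = image.shape
    y, x = np.indices(image.shape)
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    # Integer radius bin for every pixel.
    r = np.hypot(y - cy, x - cx).astype(int)
    nbins = r.max() + 1

    valid = ~np.isnan(image)
    # Sum and count only the valid pixels; minlength keeps empty radii in place.
    sums = np.bincount(r[valid], weights=image[valid], minlength=nbins)
    counts = np.bincount(r[valid], minlength=nbins)

    profile = np.full(nbins, np.nan)
    has_data = counts > 0
    profile[has_data] = sums[has_data] / counts[has_data]
    return profile

# Demo: pixel value equals its radius bin, with radius 1 entirely missing.
yy, xx = np.indices((5, 5))
radius = np.hypot(yy - 2, xx - 2).astype(int)
image = radius.astype(float)
image[radius == 1] = np.nan
profile = azimuthal_average(image)
print(profile)
```

Radii that are only partly missing still get an average over whatever valid pixels remain.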

silent swan Jun 18, 2019, 2:45 PM

#

@granite bobcatotboi one pattern that I've seen for example, is where transfer learning is performed and the output of the model being reused/transferred is referred to as an embedding

#

depending on what you refer to as classification (sounds like you're more referring to the cases of few-class classification, otherwise even generation tasks are classification over the vocabulary), I can probably pull up some NLP papers for that

#

I would contrast "embedding" with "representation", which I think is more appropriate for this usage. On one hand, representation is broad enough to basically say nothing other than "contains information", but on the other hand that's probably more appropriate, because in these cases we're just taking some intermediate transformation of the input. In contrast, I think an embedding is a more specific term that say something more about the space/transformation being applied

lapis sequoia Jun 18, 2019, 4:48 PM

#

@brazen wing sry for ping but if u mind can u help me fix it?

#

tomorrow is fine too

brazen wing Jun 18, 2019, 4:49 PM

#

sure

#

you want it to give you an array of size 10

#

with a 1 for whatever the right category is. but your training data provides the y value as an int from 0 to 9

#

you just need to convert that from a 1d array to a 2d array with 1's in the corresponding int index

#

📎 unknown.png

#

this is what i did

#

@lapis sequoia
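
For reference, a minimal sketch of that 1D-to-2D conversion in NumPy. It assumes the labels are integers from 0 to 9; the helper name one_hot is made up for illustration and is not necessarily what the screenshot shows.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn a 1D array of integer class labels into a 2D one-hot array."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

y_train = np.array([0, 3, 9])
y_encoded = one_hot(y_train)  # shape (3, 10)
```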

bitter pewter Jun 18, 2019, 7:09 PM

#

'sup guys!

I'm developing a code to get into a javascript environment, then I want to scrape the data from the website using BeautifulSoup. The point is that I realized that there isn't any table in the environment, so I was wondering about how can I scrape the data from the website.

Any tips?

MWE:

```python
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import re
import pandas as pd
from tabulate import tabulate
import os

url = "https://scon.stj.jus.br/SCON/legaplic/toc.jsp?materia=%27Lei+8.429%2F1992+%28Lei+DE+IMPROBIDADE+ADMINISTRATIVA%29%27.mat.&b=TEMA&p=true&t=&l=1&i=18&ordem=MAT,@NUM"

driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get(url)

python_button = driver.find_element_by_xpath('/html/body/div[2]/div[6]/div/div/div[3]/div[2]/div/div/div/div[16]/a')
python_button.click()

driver.switch_to.window(driver.window_handles[-1])

python_button = driver.find_element_by_xpath('/html/body/div[2]/div[6]/div[1]/div/div[3]/div[2]/div/div/div/div[3]/div[2]/span[2]/a')
python_button.click()

driver.switch_to.window(driver.window_handles[-1])

pagina_de_resultados = BeautifulSoup(driver.page_source, 'lxml')

table = pagina_de_resultados.find_all('table')

df = pd.read_html(str(table), header=0)

datalist.append(df[0])

x += 1

driver.quit()

result = pd.concat([pd.DataFrame(datalist[i]) for i in range(len(datalist))], ignore_index=True)

json_records = result.to_json(orient='records')

print(tabulate(result, headers=["Employee Name", "Job Title", "Overtime Pay", "Total Gross Pay"], tablefmt='psql'))

path = os.getcwd()

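# os.path.join builds the path safely; a literal "\f" inside a string is a form-feed escape
with open(os.path.join(path, "fhsu_payroll_data.json"), "w") as f:
    f.write(json_records)
```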

#

Currently getting the following traceback:

```
Traceback (most recent call last):
  File "C:/Users/Pedro/PycharmProjects/data_scraping/data_scraping.py", line 29, in <module>
    df = pd.read_html(str(table), header=0)
  File "C:\Users\Pedro\PycharmProjects\data_scraping\venv\lib\site-packages\pandas\io\html.py", line 1094, in read_html
    displayed_only=displayed_only)
  File "C:\Users\Pedro\PycharmProjects\data_scraping\venv\lib\site-packages\pandas\io\html.py", line 916, in _parse
    raise_with_traceback(retained)
  File "C:\Users\Pedro\PycharmProjects\data_scraping\venv\lib\site-packages\pandas\compat\__init__.py", line 420, in raise_with_traceback
    raise exc.with_traceback(traceback)
ValueError: No tables found
```

quaint ruin Jun 18, 2019, 7:21 PM

#

Hi, I wish to calculate the weighted average of n data points while lowering 2 specific data points to an arbitrary value of 1 and adjusting the other data points so that the ratio of the weights is kept, as well as the weighted average. How can I achieve this?
For example I have the following data points and weights:

value1 - 1.186, weight1 - 100
value2 - 1.294, weight2 - 50
value3 - 2.157, weight3 - 200
value4 - 3.235, weight4 - 150

average - 2.2

I wish to set value1 and value2 to 1 and adjust value3 and value4 to satisfy my condition, i.e. keep the weights ratio and the weighted average.
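
One possible reading of this, sketched under assumptions: the weights stay untouched (so their ratios are kept), the chosen points are pinned to 1, and the remaining values are all multiplied by one common factor chosen so the weighted sum is unchanged. The helper name pin_and_rescale and the "common factor" interpretation are guesses, not something the question states.

```python
import numpy as np

def pin_and_rescale(values, weights, pinned_idx, pinned_value=1.0):
    """Pin some values and rescale the rest so the weighted average is preserved."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    pinned = np.zeros(values.size, dtype=bool)
    pinned[list(pinned_idx)] = True

    target_total = np.sum(weights * values)              # weighted sum to preserve
    pinned_total = pinned_value * weights[pinned].sum()  # contribution of pinned points
    free_total = np.sum(weights[~pinned] * values[~pinned])

    # Common factor for the free values so the weighted sum comes out the same.
    scale = (target_total - pinned_total) / free_total

    adjusted = values.copy()
    adjusted[pinned] = pinned_value
    adjusted[~pinned] *= scale
    return adjusted

values = [1.186, 1.294, 2.157, 3.235]
weights = [100, 50, 200, 150]
adjusted = pin_and_rescale(values, weights, pinned_idx=[0, 1])
```

With the numbers above, value3 and value4 are both scaled up by the same factor of roughly 1.036, which keeps their ratio to each other and the overall weighted average.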

desert oar Jun 18, 2019, 8:30 PM

#

@void anvil nice, thanks for that. doesn't cover everything i'd want but it's a start

#

still would be looking for seasonal decomposition

void anvil Jun 18, 2019, 8:32 PM

#

That’s literally all in statsmodels

#

And a multitude of other packages

wary fox Jun 18, 2019, 8:43 PM

#

Just throwing this out there because I'm interested. Does anyone know how common it is for people to pair computer science degrees and biology degrees in college, like in a double major? I have to pick my college major soon and I love both but am not sure if it's a thing that is possible.

lapis sequoia Jun 18, 2019, 8:58 PM

#

@wary fox it is.. only if you're into bio statistics.. plan to go work for pharma or a clinical research organization..

#

or maybe consulting.. where you're advising those sorts of companies

wary fox Jun 18, 2019, 8:59 PM

#

@lapis sequoia interesting. So its not common for field biologist to have computer science degrees?

lapis sequoia Jun 18, 2019, 8:59 PM

#

what's a field biologist.. :v

wary fox Jun 18, 2019, 8:59 PM

#

https://study.com/articles/Field_Biologist_Job_Description_Duties_and_Requirements.html

Study.com

Field Biologist: Job Description, Duties and Requirements

People searching for Field Biologist: Job Description, Duties and Requirements found the links, articles, and information on this page helpful.

lapis sequoia Jun 18, 2019, 9:00 PM

#

I'm really not sure of what other prospects biologists have.. aside from ones related to data science..

wary fox Jun 18, 2019, 9:02 PM

#

Oh ok, thanks for your input though! Its much appreciated.

brazen wing Jun 18, 2019, 10:02 PM

#

@bitter pewter dont use a form then? you can just select the text you want using an id or name

bitter pewter Jun 18, 2019, 10:10 PM

#

Hey @brazen wing can you give me some light? Maybe a documentation link or any kind of guide... I’m kinda lost LOL newbie at Python

brazen wing Jun 18, 2019, 10:13 PM

#

sure I'll try. but what data are you actually trying to get

bitter pewter Jun 18, 2019, 10:14 PM

#

Have you tried the code? It redirects you to a public database of a tribunal court. That database have some judgements infos. I want to get that info and copy to a table. I’ll try to show you! Give me a sec.

#

As you can see, I want to get the info into “Processo”, “Relator(a)” etc. fields

📎 image0.png

brazen wing Jun 18, 2019, 10:16 PM

#

ill just get up the webpage

bitter pewter Jun 18, 2019, 10:16 PM

#

Sure! I’ll get you the url

#

https://scon.stj.jus.br/SCON/legaplic/toc.jsp?materia='Lei+8.429%2F1992+(Lei+DE+IMPROBIDADE+ADMINISTRATIVA)'.mat.&b=TEMA&p=true&t=&l=1&i=18&ordem=MAT,@NUM

#

You should click on § 6º text

#

And then click in “41 documento(s) encontrado(s)”

brazen wing Jun 18, 2019, 10:18 PM

#

ah ok i see

#

well first of all i don't think you are selecting this

#

when you click in selenium

bitter pewter Jun 18, 2019, 10:19 PM

#

Oh, it’s going right to where I’ve pointed

#

It works well till I try to get the data

brazen wing Jun 18, 2019, 10:20 PM

#

yeah its a bunch of divs

bitter pewter Jun 18, 2019, 10:20 PM

#

Yeah LOL

brazen wing Jun 18, 2019, 10:21 PM

#

well the class of both the entries

#

is docTexto

#

im just getting it up in python

bitter pewter Jun 18, 2019, 10:23 PM

#

Sure. I’ve saw that we have docTexto as class and a bunch of divs about those fields. I’m not sure that there is a pattern about those divs names.

olive pine Jun 18, 2019, 10:31 PM

#

Anyone know any ML groups?

lean ledge Jun 18, 2019, 10:32 PM

#

There's an AI server, a data science server, /r/learnmachinelearning server

#

Channels on every major programming discord

#

I never liked any of them much though so your mileage may vary

#

Smart ML people seem to not use discord too much

brazen wing Jun 18, 2019, 10:33 PM

#

@bitter pewter ok its the first two elements of class docTexto

#

so you can use driver.find_elements_by_class_name("docTexto")

#

and that will give you a list

#

of the elements

olive pine Jun 18, 2019, 10:34 PM

#

@lean ledge I am a beginner.

bitter pewter Jun 18, 2019, 10:35 PM

#

I’ll use it with append?

brazen wing Jun 18, 2019, 10:35 PM

#

Well you will need to get the text from them.

#

so it will be like

#

textList = driver.find_elements_by_class_name("docTexto")

lean ledge Jun 18, 2019, 10:36 PM

#

Other beginners are not who you want to talk to as a beginner

brazen wing Jun 18, 2019, 10:36 PM

#

and then textList[0].text will give you a string of Processo field

olive pine Jun 18, 2019, 10:37 PM

#

Yes, I am looking for a group that knows what they are doing.

bitter pewter Jun 18, 2019, 10:37 PM

#

I see! 0-X will be the position of the field I want

brazen wing Jun 18, 2019, 10:37 PM

#

yeah thats right

bitter pewter Jun 18, 2019, 10:37 PM

#

Nice! How do I get that texts and makes a table from it?

#

I want to get it as a spreadsheet

#

table = textList ?

brazen wing Jun 18, 2019, 10:39 PM

#

well you will probably need to construct it with pandas
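
A minimal sketch of that step, assuming every result on the page contributes the same fields in the same order, so the flat list of texts can be cut into equal-sized rows. The helper name texts_to_table and the sample strings are made up for illustration.

```python
import pandas as pd

def texts_to_table(field_names, texts):
    """Group a flat list of field texts into rows, one row per document.

    A trailing incomplete group (fewer texts than field names) is dropped.
    """
    n = len(field_names)
    usable = len(texts) - len(texts) % n
    rows = [texts[i:i + n] for i in range(0, usable, n)]
    return pd.DataFrame(rows, columns=field_names)

# Placeholder strings standing in for [element.text for element in textList].
sample_texts = ["text A1", "text A2", "text B1", "text B2"]
table = texts_to_table(["Processo", "Relator(a)"], sample_texts)
```

The resulting DataFrame can then be written to a spreadsheet with table.to_excel("some_file.xlsx"), where the file name is up to you.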

bitter pewter Jun 18, 2019, 10:39 PM

#

I see!

#

Np, I will study about pandas later

#

As you can see, I’ve kinda copied the code I’m using

#

I’m trying to understand it time by time

brazen wing Jun 18, 2019, 10:40 PM

#

yes it did seem that way haha. an ambitious start

#

pandas is great though, so learning it will be useful

bitter pewter Jun 18, 2019, 10:40 PM

#

I’m trying to use it as my undergraduate thesis in Law

#

🤣

#

I’ll make an analysis of a Court’s decisions group

#

If it’s ok, I’ve added you in my friends list. I won’t bother you 🤣

brazen wing Jun 18, 2019, 10:42 PM

#

yeah no problem. you can probably just message me in here anyway

#

but yeah if you just do a simple tutorial on pandas, it should be obvious how to make a table of strings then

bitter pewter Jun 18, 2019, 10:43 PM

#

Sure!

#

Thanks for your help, Bozo

#

You amazing!

brazen wing Jun 18, 2019, 10:43 PM

#

no prob dude

proper sierra Jun 18, 2019, 11:20 PM

#

Hi I made a little data mining / data science project.
Feel free to check it out.😄
https://github.com/tomg404/Zeit-Scraper

GitHub

tomg404/Zeit-Scraper

Little Data Mining Project. Contribute to tomg404/Zeit-Scraper development by creating an account on GitHub.

earnest prawn Jun 18, 2019, 11:30 PM

#

@proper sierra looks a little inspired by the SPIEGEL mining talk at 34C3 right?

proper sierra Jun 18, 2019, 11:58 PM

#

@earnest prawn yes xd

📎 unknown.png

lapis sequoia Jun 19, 2019, 6:34 AM

#

hey this is pretty cool

strong flare Jun 19, 2019, 8:19 AM

#

Hi guys, may I know how I can transform this dataset in Python like I do in Excel?

📎 unknown.png

#

like this on python

📎 unknown.png

#

may i know the first pic is showing or keep loading ?

worldly ruin Jun 19, 2019, 8:26 AM

#

just keeps loading

strong flare Jun 19, 2019, 8:34 AM

#

Hi this is the first pic look like sorry for waiting

📎 unknown.png

brazen wing Jun 19, 2019, 9:10 AM

#

@strong flare you're using pandas right?

strong flare Jun 19, 2019, 9:27 AM

#

yes

#

@brazen wing with jupyter notebook

#

do you need the dataset in excel ?

brazen wing Jun 19, 2019, 9:28 AM

#

that just means you will have it in a csv

#

or you should have it in a csv

#

you can just import from the csv with pandas, and then you will need to create a new dataframe with the headings you want

strong flare Jun 19, 2019, 9:29 AM

#

i have dataset like this

📎 mcd_vs_kfc_eg11.xlsx

#

i want to do it like this

📎 unknown.png

brazen wing Jun 19, 2019, 9:31 AM

#

right. do you know how to construct a dataframe?

strong flare Jun 19, 2019, 9:32 AM

#

i m learning right now but i dont know why show out this error
ValueError: x and y must have same first dimension, but have shapes (3174,) and (1509, 4)

#

can you help me in code

brazen wing Jun 19, 2019, 9:34 AM

#

well I would do a tutorial on pandas first to understand the basics

#

it will make it easier to understand

strong flare Jun 19, 2019, 9:35 AM

#

ok you can use my dataset to do the tutorial

brazen wing Jun 19, 2019, 9:36 AM

#

i mean look one up in google

strong flare Jun 19, 2019, 9:39 AM

#

also can

#

i tried to search alot but not working

lapis sequoia Jun 19, 2019, 10:09 AM

#

thanks bozo

desert oar Jun 19, 2019, 12:47 PM

#

man

#

both pandas and numpy docs

#

they really lack a "basic concepts" section

#

it all jumps right into technical stuff

#

the world needs a good coherent intro to this stuff

lyric canopy Jun 19, 2019, 12:47 PM

#

More basic than https://docs.scipy.org/doc/numpy/user/quickstart.html?

desert oar Jun 19, 2019, 12:48 PM

#

thats better than i remember

#

huh you know thats pretty good

#

it covers broadcasting rules

lyric canopy Jun 19, 2019, 12:48 PM

#

There's also one for pandas: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#min

desert oar Jun 19, 2019, 12:49 PM

#

10 minutes to pandas was the one i was thinking of

#

i remember when i started w/ pandas coming from R, i read 10 minutes to pandas and had absolutely no idea what was going on

#

selection by label? what?

#

the numpy one is serviceable because it starts by describing the data model

#

pandas expects you just figure out that theres this Index data structure, and that a Dataframe is logically a collection of Series, et al
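
To make that data model concrete, here is a tiny sketch with made-up fruit data: each column of a DataFrame is a Series, all columns share one Index of row labels, and .loc selects by label while .iloc selects by position.

```python
import pandas as pd

# A small frame with explicit row labels (the Index).
df = pd.DataFrame(
    {"price": [1.5, 2.0, 0.5], "qty": [10, 3, 7]},
    index=["apple", "pear", "plum"],
)

column = df["qty"]              # one column is a Series sharing df's Index
by_label = df.loc["pear"]       # selection by label: the row labelled "pear"
by_position = df.iloc[1]        # selection by position: the second row
cell = df.loc["plum", "price"]  # row label plus column label gives one value
```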

lyric canopy Jun 19, 2019, 12:50 PM

#

I think the main problem in the pandas one is that they omit half of the example

#

Show the df with labels and then the result of a selection

desert oar Jun 19, 2019, 12:51 PM

#

ugh yes. they reuse the same example from the beginning and expect you to keep track

lyric canopy Jun 19, 2019, 12:51 PM

#

That way, you can see what's happening and learn the terminology

#

Yeah, it's one of the basic writing principles we were taught in our scientific writing course: Don't rely on your readers remembering things read on page 1 when they're reading page 7

desert oar Jun 19, 2019, 12:52 PM

#

good principle

wide gyro Jun 19, 2019, 1:44 PM

#

Hey guys, I want to make my clean function more accessible and wanted to know if drop.na(subset='COLUMNS') could be set to where it reads every column no matter what file you put through it

#

So if I have a csv of 15 columns, it'll sort through them, but then sort through another csv with 5

desert oar Jun 19, 2019, 1:46 PM

#

@wide gyro just leave off subset=

#

if you mean dropna()

wide gyro Jun 19, 2019, 1:47 PM

#

so just dropna and that's it

desert oar Jun 19, 2019, 1:47 PM

#

yes

wide gyro Jun 19, 2019, 1:47 PM

#

well with ()

desert oar Jun 19, 2019, 1:52 PM

#

but in general you can get all the columns from a dataframe with df.columns
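
A short sketch of what that looks like in practice. The function name clean and the two toy frames are made up; the point is only that dropna() with no subset already checks every column, whatever the file's width.

```python
import numpy as np
import pandas as pd

def clean(df):
    """Drop every row that has a missing value in any column."""
    return df.dropna()

# Two frames with different column counts, each with one missing cell.
wide = pd.DataFrame(np.arange(30, dtype=float).reshape(2, 15))
wide.iloc[0, 4] = np.nan
narrow = pd.DataFrame(np.ones((3, 5)))
narrow.iloc[1, 2] = np.nan

clean_wide = clean(wide)      # 1 row, 15 columns
clean_narrow = clean(narrow)  # 2 rows, 5 columns

# Equivalent explicit form, using df.columns to list every column.
same = narrow.dropna(subset=list(narrow.columns))
```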

bitter pewter Jun 19, 2019, 2:04 PM

#

https://www.youtube.com/watch?v=e60ItwlZTKM

YouTube

Joe James

Python: Pandas Tutorial | Intro to DataFrames

Tutorial on the basics of Python's data frames (spread sheet) library, Pandas in this tutorial. Intro to statistical data analysis and data science. RELATED ...

#

Here's a good guide on Pandas dataframes

#

Ignore the "from Silicon Valley" crap marketing lol

bitter pewter Jun 19, 2019, 2:56 PM

#

Hey, guys! I'm trying to make the code to read 5 different pages content and gather it as a table in Pandas. How do I make the selenium to change the pages and read the content?

#

MWE:

#

```python
from selenium import webdriver
from bs4 import BeautifulSoup
from openpyxl import Workbook
import numpy as np
import pandas as pd

url = "https://scon.stj.jus.br/SCON/legaplic/toc.jsp?materia=%27Lei+8.429%2F1992+%28Lei+DE+IMPROBIDADE+ADMINISTRATIVA%29%27.mat.&b=TEMA&p=true&t=&l=1&i=18&ordem=MAT,@NUM"

driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get(url)

python_button = driver.find_element_by_xpath('/html/body/div[2]/div[6]/div/div/div[3]/div[2]/div/div/div/div[16]/a')  # Points to the provision (art. 17, § 6)
python_button.click()

driver.switch_to.window(driver.window_handles[-1])

python_button = driver.find_element_by_xpath('/html/body/div[2]/div[6]/div[1]/div/div[3]/div[2]/div/div/div/div[3]/div[2]/span[2]/a')  # Points to the search results
python_button.click()

driver.switch_to.window(driver.window_handles[-1])

textList = driver.find_elements_by_class_name("docTexto")  # Pulls the data from the fields of the results list

resultados = BeautifulSoup(driver.page_source, 'lxml')

parse = resultados.find('div', {'id':'listadocumentos'})
paragrafoBRS = parse.find_all('div',{'class':'paragrafoBRS'})

header = []
content = []
for each in paragrafoBRS:
    header.append(each.find('h4', {'class':'docTitulo'}).text.strip())
    content.append(each.find(['div','pre'], {'class':'docTexto'}).text.strip())

# Build the single-row table once, after every field has been collected
df = pd.DataFrame([content], columns=header)

df.to_excel('dados.xlsx')  # Exports the data to an .xlsx file

driver.quit()
```

#

It already successfully exports the data to a .xlsx spreadsheet
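
A hedged sketch of the looping part only, kept independent of the site: parse_current_page would wrap the BeautifulSoup parsing above and return a DataFrame, and go_to_next_page would click whatever "next page" link the site offers and return False when there is none. Both callables and the name scrape_pages are hypothetical; how to locate the next-page link on this particular site is not known here.

```python
import pandas as pd

def scrape_pages(parse_current_page, go_to_next_page, max_pages=5):
    """Collect one DataFrame per page and stack them into a single table.

    parse_current_page() -> DataFrame for the page currently shown.
    go_to_next_page()    -> True if it moved to another page, False otherwise.
    """
    frames = []
    for page in range(max_pages):
        frames.append(parse_current_page())
        is_last_wanted_page = page == max_pages - 1
        # Only try to advance if more pages are wanted; stop if there are none.
        if not is_last_wanted_page and not go_to_next_page():
            break
    return pd.concat(frames, ignore_index=True)
```

With Selenium, go_to_next_page would typically find the link, call .click() on it and return True, or return False if the element cannot be found.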

wide gyro Jun 19, 2019, 8:31 PM

#

I'm trying to output a coordinates map but my lat and lon are switched and I think that's the reason for making my map behind it super small

#

as in my lat is on x axis and lon is y axis

#

I am setting geometry = [Point(xy) for xy in zip(df["lng"],df["lat"])]

#

Which I thought would do the trick, but the latitude is still x axis

#

Nevermind

#

I think I got it

desert oar Jun 20, 2019, 12:19 AM

#

it depends on how you defined your Point class

lapis sequoia Jun 20, 2019, 2:09 AM

#

Also check out namedtuples

strong flare Jun 20, 2019, 2:13 AM

#

```python
import pandas as pd

df = pd.read_excel(r'data\mcd vs kfc eg.xlsx')
KFC = df['Restaurants'] == 'KFC'
KFC_Amount = df['Amount'] > 0
KFC_Data = df[KFC & KFC_Amount]

MCD = df['Restaurants'] == 'MCD'
MCD_Amount = df['Amount'] > 0
MCD_Data = df[MCD & MCD_Amount]  # select the MCD rows with a positive amount

KFC_Data = KFC_Data.groupby(['Date']).sum()
MCD_Data = MCD_Data.groupby(['Date']).sum()
```

HI Guys may i know how can i join this two data (KFC_Data,MCD_Data) in a table with specific columns name ?

#

is it i have convert this 2 data to new dataframe and join together ?

onyx granite Jun 20, 2019, 2:48 AM

#

you need a key to join on

#

or use the default index (if the keys actually match)

strong flare Jun 20, 2019, 3:03 AM

#

@onyx granite it can be date ? a column in the dataset

onyx granite Jun 20, 2019, 3:04 AM

#

there should be a way to specify an index to be a datetime

#

check the docs and you should be able to join based on that
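
As a small sketch of joining on the date: after groupby("Date") the dates become the index of each result, so the two frames can be joined on that index, with suffixes giving the columns distinct names. The toy data and the helper name daily_totals are made up; only the column names Date, Restaurants and Amount come from the code above.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2019-03-01", "2019-03-01", "2019-04-01", "2019-04-01", "2019-05-01"]
    ),
    "Restaurants": ["KFC", "MCD", "KFC", "MCD", "KFC"],
    "Amount": [10.0, 20.0, 30.0, 40.0, 50.0],
})

def daily_totals(frame, restaurant):
    """Sum the positive amounts of one restaurant per date (Date becomes the index)."""
    rows = frame[(frame["Restaurants"] == restaurant) & (frame["Amount"] > 0)]
    return rows.groupby("Date")[["Amount"]].sum()

kfc = daily_totals(df, "KFC")
mcd = daily_totals(df, "MCD")

# Join on the shared Date index; an outer join keeps dates present in either frame.
combined = kfc.join(mcd, how="outer", lsuffix="_KFC", rsuffix="_MCD")
```

Dates that appear for only one restaurant end up with NaN in the other restaurant's column.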

lapis sequoia Jun 20, 2019, 3:06 AM

#

hi

#

i'm looking for a good plot

#

I have some groups names in a column and their correlation score in a column

#

what sort of plot would look good

strong flare Jun 20, 2019, 4:10 AM

#

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel(r'Data\mcd vs kfc eg11.xlsx')

KFC = df['Restaurants'] == 'KFC'
KFC_Amount = df['Amount'] > 0
KFC_Data = df[KFC & KFC_Amount]

KFC_Data.rename(columns={
    'Amount': 'KFC_Sales',
    'Count': 'KFC_Count',
}, inplace=True)
KFC_Data.set_index('Date')
KFC = KFC_Data.groupby('Date').sum()

MCD = df['Restaurants'] == 'MCD'
MCD_Amount = df['Amount'] > 0
MCD_Data = df[MCD & MCD_Amount]

MCD_Data.rename(columns={
    'Amount': 'MCD_Sales',
    'Count': 'MCD_Count',
}, inplace=True)
MCD_Data.set_index('Date')
MCD = MCD_Data.groupby('Date').sum()

Combined = pd.concat([KFC, MCD], axis=1)

# Plot with differently-colored markers.
plt.plot(Combined.index, Combined.KFC_Sales, 'b-', label='KFC_Sales')
plt.plot(Combined.index, Combined.MCD_Sales, 'g-', label='MCD_Sales')
plt.plot(Combined.index, Combined.KFC_Count, 'r-', label='KFC_COUNT')
plt.plot(Combined.index, Combined.MCD_Count, 'y-', label='MCD_COUNT')

# Create legend.
plt.legend(loc='lower right')
plt.xlabel('Date')
plt.ylabel('KFC VS MCD')
plt.show()
```

Finally success yoj

#

but it still showout error

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
To register the converters:
>>> from pandas.plotting import register_matplotlib_converters
>>> register_matplotlib_converters()
warnings.warn(msg, FutureWarning)

lapis sequoia Jun 20, 2019, 6:10 AM

#

that's just a warning

strong flare Jun 20, 2019, 6:41 AM

#

@lapis sequoia thanks i will just ignore
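
The SettingWithCopyWarning can usually be avoided rather than ignored. A minimal sketch, with toy data standing in for the spreadsheet: take an explicit .copy() of the filtered rows and use the non-inplace rename, so nothing is modified on a slice of the original frame.

```python
import pandas as pd

# Toy data in place of the Excel file; only the column names come from the code above.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2019-03-01", "2019-03-01", "2019-04-01"]),
    "Restaurants": ["KFC", "MCD", "KFC"],
    "Amount": [10.0, 20.0, 30.0],
    "Count": [1, 2, 3],
})

# .copy() makes the filtered rows an independent frame instead of a view-like slice.
kfc_rows = df[(df["Restaurants"] == "KFC") & (df["Amount"] > 0)].copy()

# rename() returns a new frame, so no inplace modification is needed.
kfc_rows = kfc_rows.rename(columns={"Amount": "KFC_Sales", "Count": "KFC_Count"})

kfc = kfc_rows.groupby("Date")[["KFC_Sales", "KFC_Count"]].sum()
```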

vestal lagoon Jun 20, 2019, 7:39 AM

#

Does anyone have any good suggestions for economic forecasting libraries on python

#

I’m using ARIMA for time series but I feel it’s lacking complexity for economics and I don’t know how to fix that lol

strong flare Jun 20, 2019, 8:02 AM

#

Hi all Pros, May i know how can i select month on date ?
like i need to select March on the date listed below, how can i get it ? the dtype is ('<M8[ns]')

2019-03-01
2019-04-01    
2019-05-01

polar acorn Jun 20, 2019, 9:40 AM

#

@vestal lagoon You can check out facebook's prophet library. It's a bit more advanced but still quite intuitive. Time series forecasting is hard though and simple models often perform as well as more complex models so be aware that there might not be a magic bullet here.

#

@strong flare You can get the month of a single datetime value date with date.month. If your dates are a column in a pandas DataFrame called df, go through the .dt accessor to get a subset of the DataFrame for March, like this: df.loc[df['Dates'].dt.month == 3]
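
A tiny runnable version, using the three dates from the question. The column names Dates and Amount are assumptions; a datetime64[ns] column exposes its month through the .dt accessor.

```python
import pandas as pd

df = pd.DataFrame({
    "Dates": pd.to_datetime(["2019-03-01", "2019-04-01", "2019-05-01"]),
    "Amount": [1, 2, 3],
})

# Boolean mask of rows whose month is March, then select those rows.
march = df.loc[df["Dates"].dt.month == 3]
```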

strong flare Jun 20, 2019, 9:52 AM

#

@polar acorn ok got it thank you

spark nimbus Jun 20, 2019, 11:06 AM

#

So I found this post with the exact same question I have, and the answer is very clear to me. However, I'm having a lot of issues implementing this, can anyone give me some directions on what I should and shouldn't do?
https://dsp.stackexchange.com/questions/24017/which-filter-for-an-audio-equalizer?rq=1

Signal Processing Stack Exchange

Which filter for an audio equalizer

I’m new to DSP and I have to do an audio equalizer in c++.
I did a lot of research about it and tried some stuff in the last month, but I’m a little overwhelmed with all those informations and it’...

#

also @summer plover could the admins discuss the possibility of opening up a channel specifically for audio/video/other signal processing?

summer plover Jun 20, 2019, 11:09 AM

#

that is something we can talk about. could you make the same request in #community-meta as well?

spark nimbus Jun 20, 2019, 11:09 AM

#

sure

summer plover Jun 20, 2019, 11:10 AM

#

❤

lean ledge Jun 20, 2019, 11:30 AM

#

@spark nimbus hmu with your signal processing questions

#

putting them mechatronic engineering skills to work

#

they have specific terms btw, not frequency domain vs time domain

spark nimbus Jun 20, 2019, 11:30 AM

#

okay so tl;dr I want to add an EQ for users to configure simply with sliders (similar to how V4A works if you've ever used it)

lean ledge Jun 20, 2019, 11:30 AM

#

IIR and FIR

#

IIR will probably be better

#

You can treat them as multiple bandpass butterworth filters for which you change the gain

spark nimbus Jun 20, 2019, 11:31 AM

#

I what

#

note that I have practically zero knowledge of such terms

#

I know IIR and FIR and thats it

lean ledge Jun 20, 2019, 11:32 AM

#

https://en.wikipedia.org/wiki/Band-pass_filter
https://en.wikipedia.org/wiki/Butterworth_filter

Band-pass filter

A band-pass filter, also bandpass filter or BPF, is a device that passes frequencies within a certain range and rejects (attenuates) frequencies outside that range.

Butterworth filter

The Butterworth filter is a type of signal processing filter designed to have a frequency response as flat as possible in the passband. It is also referred to as a maximally flat magnitude filter. It was first described in 1930 by the British engineer and physicist Stephen Bu...

spark nimbus Jun 20, 2019, 11:33 AM

#

Basically the user changes these sliders (though idk how to extract the cutoff values used here) and it gets applied to my signal

📎 Screenshot_20190620-133130.jpg

lean ledge Jun 20, 2019, 11:35 AM

#

Just make arbitrary cutoffs (take the entire 20-20khz range and split it up (possibly in log scale?)) and make a bandpass filter for each range

spark nimbus Jun 20, 2019, 11:35 AM

#

I couldn't find any code that does this at all sadly

#

any recommendation on how to do that?

#

also, how would I make a bandpass filter?

#

and do I pass it through all of them sequentially or in parallel and then sum them?

lean ledge Jun 20, 2019, 11:36 AM

#

Bandpass filter is essentially a low pass and a high pass together

#

And parallel and then sum
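
A rough sketch of that idea with SciPy, not a production equalizer: split 20 Hz to 20 kHz into log-spaced bands, build one Butterworth band-pass per band, run the signal through all of them in parallel, scale each output by its slider gain and sum. The function names, the 44.1 kHz sample rate, five bands and filter order 2 are all assumptions; summing parallel IIR bands also leaves some ripple where bands overlap.

```python
import numpy as np
from scipy import signal

def make_band_filters(n_bands=5, fs=44100, f_lo=20.0, f_hi=20000.0, order=2):
    """One Butterworth band-pass (as second-order sections) per log-spaced band."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    return [
        signal.butter(order, (lo, hi), btype="bandpass", fs=fs, output="sos")
        for lo, hi in zip(edges[:-1], edges[1:])
    ]

def equalize(samples, gains_db, filters):
    """Filter the samples through every band in parallel and sum the scaled outputs."""
    samples = np.asarray(samples, dtype=float)
    out = np.zeros_like(samples)
    for gain_db, sos in zip(gains_db, filters):
        linear_gain = 10 ** (gain_db / 20)  # slider value in dB -> amplitude factor
        out += linear_gain * signal.sosfilt(sos, samples)
    return out
```

It works on a plain list or array of floats: equalize(samples, [0, 3, -6, 0, 2], make_band_filters()) applies one gain per band.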

spark nimbus Jun 20, 2019, 11:36 AM

#

what's a low pass/high pass filter?

lean ledge Jun 20, 2019, 11:38 AM

#

https://github.com/twoz/pyEQ

GitHub

twoz/pyEQ

Contribute to twoz/pyEQ development by creating an account on GitHub.

#

parametric equalizer in python

#

should be able to pick up a few things

paper niche Jun 20, 2019, 11:39 AM

#

Anyone familiar with pyspark? i have a spark dataframe containing a column of arrays (A) and another column of structs (B). The array elements themselves are structs of the same structure as the dataframe struct column (B), what’s the correct way to append B to arrays in A, row-wise?

spark nimbus Jun 20, 2019, 11:39 AM

#

I checked that one but it didn't seem like it worked the same nor was the code very readably structured

#

and it also doesnt specify the width

lean ledge Jun 20, 2019, 11:42 AM

#

looking at the code, there's 5 filters and each is initialised with its parameters already made

spark nimbus Jun 20, 2019, 11:42 AM

#

I also tried checking the pulseeffects source but it's also pretty difficult to understand their implementation

lean ledge Jun 20, 2019, 11:45 AM

#

https://github.com/twoz/pyEQ/blob/master/filters.py
defines all filter types, in particular as elliptic filters for the band-pass and high pass filters and butterworth filter for low pass. it uses scipy as a backend so it shouldnt be hard to figure out what each parameter is for with a little bit of documentation reading

spark nimbus Jun 20, 2019, 11:46 AM

#

so I wanted the bandpass filter right

lean ledge Jun 20, 2019, 11:46 AM

#

multiple band pass filters and then a low and high pass

#

for either end

spark nimbus Jun 20, 2019, 11:46 AM

#

is that the HPButter one?

lean ledge Jun 20, 2019, 11:46 AM

#

HPButter -> butterworth high pass

spark nimbus Jun 20, 2019, 11:46 AM

#

uhh

lean ledge Jun 20, 2019, 11:46 AM

#

https://en.wikipedia.org/wiki/Filter_(signal_processing)

Filter (signal processing)

In signal processing, a filter is a device or process that removes some unwanted components or features from a signal. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often...

spark nimbus Jun 20, 2019, 11:47 AM

#

and now in a way I understand? ^^'

lean ledge Jun 20, 2019, 11:47 AM

#

This gives basic background

spark nimbus Jun 20, 2019, 11:47 AM

#

I've been through this one before

#

doesn't mean I understood half of it >~>

lean ledge Jun 20, 2019, 11:49 AM

#

high pass blocks low frequencies (high ones pass), low pass blocks high frequencies (low ones pass), bandpass blocks both low and high except in a band that passes

#

butterworth, chebyshev and elliptic are types of filters
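
In code, those three behaviours can be checked by asking SciPy for the gain of a filter at a few frequencies. This is only an illustrative sketch: the helper name gain_at, the 1 kHz cutoff, the 500 Hz to 2 kHz band and the 44.1 kHz sample rate are arbitrary choices.

```python
import numpy as np
from scipy import signal

def gain_at(freqs_hz, btype, cutoff_hz, fs=44100, order=2):
    """Magnitude of a Butterworth filter's response at the given frequencies."""
    b, a = signal.butter(order, cutoff_hz, btype=btype, fs=fs)
    _, h = signal.freqz(b, a, worN=np.asarray(freqs_hz, dtype=float), fs=fs)
    return np.abs(h)

low = gain_at([100, 10000], "lowpass", 1000)    # passes 100 Hz, blocks 10 kHz
high = gain_at([100, 10000], "highpass", 1000)  # blocks 100 Hz, passes 10 kHz
band = gain_at([50, 1000, 15000], "bandpass", (500, 2000))  # passes only 1 kHz
```

A gain near 1 means the frequency passes, a gain near 0 means it is blocked.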

spark nimbus Jun 20, 2019, 11:49 AM

#

so to make a bandpass I just add up a low pass and high pass?

#

wont that make a giant spike tho

lean ledge Jun 20, 2019, 11:50 AM

#

giant spike?

#

📎 unknown.png

spark nimbus Jun 20, 2019, 11:51 AM

#

📎 unknown.png

lean ledge Jun 20, 2019, 11:51 AM

#

\omega is angular frequency, |G| is the magnitude of gain

spark nimbus Jun 20, 2019, 11:52 AM

#

this is adding up both

lean ledge Jun 20, 2019, 11:52 AM

#

Where'd you get those graphs from?

spark nimbus Jun 20, 2019, 11:52 AM

#

it's an approximation I made in desmos using just log_e(x) and e^x

#

both converging to the same Y

lean ledge Jun 20, 2019, 11:55 AM

#

it's not what filter responses look like generally
you dont add them up in the + sense of the word, you compose them. one after another. that's a convolution in the time domain, multiplication in s domain

spark nimbus Jun 20, 2019, 11:55 AM

#

a what

lean ledge Jun 20, 2019, 11:55 AM

#

you have control over the cutoff frequencies. you can shift them left and right

spark nimbus Jun 20, 2019, 11:55 AM

#

whats a time domain and s domain and convolution

lean ledge Jun 20, 2019, 11:57 AM

#

f(t) is your signal
f(\omega) is your signal in frequency domain (after fourier transform)
f(s) is your signal in complex frequency domain (after laplace transform)
convolution is a mathematical operation involving shifting, multiplying and integrating two functions. according to the convolution theorem, a convolution in one domain (time or frequency) is a multiplication in the other domain
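
For anyone who prefers to see it in code, a small NumPy check of that statement in the discrete case: convolving a signal with a filter kernel in the time domain gives the same result as multiplying their Fourier transforms and transforming back. The random signal and the three-tap kernel are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)       # some signal f(t)
h = np.array([0.25, 0.5, 0.25])   # a short smoothing filter

# Time domain: direct convolution.
direct = np.convolve(x, h)

# Frequency domain: zero-pad both to the full output length, multiply, invert.
n = len(x) + len(h) - 1
via_fft = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

same = np.allclose(direct, via_fft)
```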

#

Wish I could dumb it all down but it's taken me 2 semesters of classes in my engineering degree to get a hang of it all, it's not an easily approachable subject for a person without an electrical engineering background and maths is the fields' bread and butter

spark nimbus Jun 20, 2019, 12:02 PM

#

It's mostly just terminology I need to get used to

#

If I see code I practically understand complex concepts immediately

lean ledge Jun 20, 2019, 12:04 PM

#

BlobThinkingEyes Code doesnt help you learn a semester's worth of signal theory that underpins that code

#

My suggestion would be to just reuse someone else's code

spark nimbus Jun 20, 2019, 12:06 PM

#

But there's not even any code compatible with my project :(

lean ledge Jun 20, 2019, 12:11 PM

#

What's wrong with the pyEQ one?

#

it just uses scipy/numpy

spark nimbus Jun 20, 2019, 2:04 PM

#

The design is really hard to work with given my current framework

#

Just figuring out how to make it work without GUI is difficult enough already

#

Then I need to make it work from just a list of floats

silk forge Jun 20, 2019, 2:32 PM

#

📎 unknown.png

#

i need some help with this ^

#

i cant understand

desert oar Jun 20, 2019, 2:44 PM

#

Which part don't you understand