#data-science-and-ml | Python | Page 199

deft harbor May 29, 2019, 7:03 PM

#

It's in help 5

opaque vine May 30, 2019, 9:44 AM

#

Anyone with pandas experience and willing to help out?
I've got a huge dataset that looks somewhat like this, and I don't have a clue where to begin. I'd like to do some analysis on, for example, what the most common procedure codes are on an invoice IF the invoice has procedure code X, etcetera. Anyone?

📎 unknown.png

#

Got 0 experience with pandas, let alone doing analysis with Python. I can write some very basic stuff, but I can't seem to find any reasonable tutorials for what I want to do; perhaps I can't phrase the question well enough

craggy geyser May 30, 2019, 9:53 AM

#

Hi! I have a quite big sqlite database that has a timestamp column with Unix UTC timestamps with seconds resolution. I am creating a plotly-dash web application where I am using pandas to read from the database to a dataframe. I want to keep the timestamp column, but I also want a column for datetime. This is quite easy and fast for me, I do it like so:

import pandas as pd

# [...]

df["Date"] = pd.to_datetime(df["timestamp"], unit="s", utc=True)
df.Date = df.Date.dt.tz_convert(tz)

Where I also convert the timezone to the local timezone. But I've noticed that the plots in my dash applications operate a lot faster if the datetime column is represented as strings instead. So I do the following:

df.Date = df.Date.dt.strftime("%Y-%m-%d %H:%M:%S")

But this operation is very slow for a large dataframe. Is there a faster method? It seems that df.Date.astype("str") is slightly faster, but not by a large margin, and the resulting format is also not the one I'd like. Any help with this would be great 😃
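One thing that might be worth timing against strftime is formatting in NumPy instead. This is only a sketch: the timezone name and the sample timestamps are assumptions, and whether it is actually faster on the real data would need measuring.

```python
import numpy as np
import pandas as pd

tz = "Europe/Oslo"  # assumed; the original message does not name the timezone
df = pd.DataFrame({"timestamp": [1559202780, 1559202781, 1559289180]})

df["Date"] = pd.to_datetime(df["timestamp"], unit="s", utc=True).dt.tz_convert(tz)

# Drop the tz info (keeping local wall-clock time), then format in NumPy.
local_naive = df["Date"].dt.tz_localize(None).to_numpy()
iso = np.datetime_as_string(local_naive, unit="s")   # ISO strings with a "T" separator
df["DateStr"] = np.char.replace(iso, "T", " ")

print(df["DateStr"].tolist())
```

The result has the same "%Y-%m-%d %H:%M:%S" layout, so the plots should not see a difference.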

#

@opaque vine For a brief introduction, I would recommend checking out https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python 😃

DataCamp Community

Pandas Tutorial: DataFrames in Python

Explore DataFrames in Python with this Pandas tutorial, from selecting, deleting or adding indices or columns to reshaping and formatting your data.

opaque vine May 30, 2019, 10:06 AM

#

@craggy geyser yeah i suppose I'll need to, i've been going through different tutorials and or projects, but i'm guessing the stuff that i'm looking to do and solve with python are more advanced than the stuff that I find from tutorials etc. well got to keep on pounding through i suppose

craggy geyser May 30, 2019, 10:07 AM

#

@opaque vine Sounds like you want to do some sort of apply https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html

#

or just filtering and then handling, like so:

df[df["Procedure code"] == "ALE01"]

but I don't know, it's not very clear to me exactly what you want to accomplish, and I'm not an expert either
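A rough sketch of the "which procedure codes show up on invoices that contain code X" question, built on the filtering idea above. The `Invoice` column name and the sample rows are assumptions, since the real dataset is only visible in the screenshot.

```python
import pandas as pd

# Made-up sample data; the "Invoice" column name is an assumption.
df = pd.DataFrame({
    "Invoice": [1, 1, 1, 2, 2, 3, 3, 3],
    "Procedure code": ["ALE01", "BX10", "CY20", "BX10", "CY20", "ALE01", "BX10", "DZ30"],
})

target = "ALE01"

# Invoices that contain the target code at least once.
invoices_with_target = df.loc[df["Procedure code"] == target, "Invoice"].unique()

# All other lines on those invoices, counted per procedure code.
same_invoice = df["Invoice"].isin(invoices_with_target)
other_codes = df[same_invoice & (df["Procedure code"] != target)]
counts = other_codes["Procedure code"].value_counts()

print(counts)
```

`value_counts()` sorts by frequency, so the first entries are the codes that most often appear together with the target.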

opaque vine May 30, 2019, 10:10 AM

#

tbh I don't even know what to call all the stuff that i'm looking to do, I can do a lot of my work in excel and/or powerbi, but i'd love to learn python

#

yeah i'll keep on browsing, cheers

lapis sequoia May 30, 2019, 12:00 PM

#

how can i make it in a pandas dataframe that when i add a new value the oldest one is getting removed? like i want to have a dataframe of 10 values and when i add a new one, the oldest get removed?

supple ferry May 30, 2019, 12:11 PM

#

@lapis sequoia is it going to be a big one?
You can do like
df = df.tail(10) every time you add some row.
Sorry from mobile it is not easy to format the code

#

This is just a quick workaround

lapis sequoia May 30, 2019, 12:46 PM

#

what does that do @supple ferry

opaque vine May 30, 2019, 1:08 PM

#

gives you the bottom 10, so I guess whenever you add a new row it goes to the bottom, and by using df.tail(10) it shows you the last 10

lapis sequoia May 30, 2019, 1:09 PM

#

yeah but it should be removed so i don't end up with a huge dataframe

opaque vine May 30, 2019, 1:20 PM

#

well given that whenever you append(?) new stuff to the dataframe it goes to the bottom, you should be able to drop the first row after?

#

https://www.quora.com/How-should-I-delete-rows-from-a-DataFrame-in-Python-Pandas

How should I delete rows from a DataFrame in Python-Pandas? - Quora

Edit 27th Sept 2016: Added filtering using integer indexes There are 2 ways to remove rows in Python: 1. Removing rows by the row index 2. Removing rows that do not meet the desired criteria Here is the first 10 rows of the Iris dataset that will ...

#

im learning as we speak so take it with a grain of salt, but yeah you can always drop the first row like that
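Putting the tail(10) idea into a small helper might look like this. It is a sketch with a hypothetical `add_row` function and a made-up `value` column; reassigning the result is what actually discards the old rows.

```python
import pandas as pd

MAX_ROWS = 10

def add_row(df, row):
    """Append one row (a dict) and keep only the newest MAX_ROWS rows."""
    new = pd.DataFrame([row])
    combined = pd.concat([df, new], ignore_index=True)
    return combined.tail(MAX_ROWS).reset_index(drop=True)

df = pd.DataFrame([{"value": 0}])
for i in range(1, 15):
    df = add_row(df, {"value": i})   # reassign, otherwise nothing is dropped

print(df["value"].tolist())  # the ten newest values, 5 through 14
```

If rows arrive one at a time and often, a `collections.deque(maxlen=10)` of dicts that is turned into a dataframe only when needed is usually cheaper than concatenating dataframes on every insert.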

lapis sequoia May 30, 2019, 1:27 PM

#

oke cool thanks

olive willow May 30, 2019, 2:22 PM

#

guys need help, I've a quite large dataset about Pokémon and am performing analysis on it.

#

but I don't know how to create a function to get the needed result

#

import csv
import operator
from pprint import pprint
with open(r'E:\CODING\code_projects\[DATA]\pokemon.csv', newline='') as f:
    f.readline()
    Total = sum(int(row[4]) for row in csv.reader(f))
    arg_Total = Total / 721
print(Total)
print(arg_Total)

strongest = []
with open(r'E:\CODING\code_projects\[DATA]\pokemon.csv', newline='') as f:
    f.readline()
    for row in csv.reader(f):
        if int(row[4]) > 600:
            if len(strongest) <= 11:
                strongest.append([row[1], row[4], row[2]])
            else:
                pass
    pprint(sorted(strongest, key=operator.itemgetter(1), reverse=True))

#

this is the current code

#

and output:

#

301339
417.94590846047157
[['Arceus', '720', 'Normal'],
 ['Mewtwo', '680', 'Psychic'],
 ['Lugia', '680', 'Psychic'],
 ['Ho-Oh', '680', 'Fire'],
 ['Rayquaza', '680', 'Dragon'],
 ['Dialga', '680', 'Steel'],
 ['Palkia', '680', 'Water'],
 ['Giratina', '680', 'Ghost'],
 ['Slaking', '670', 'Normal'],
 ['Kyogre', '670', 'Water'],
 ['Groudon', '670', 'Ground'],
 ['Regigigas', '670', 'Normal']]

#

the func I need: you see index [0][2] in the output list? It's the type, 'Normal'.

#

and for the others also

#

I want to group every Pokémon which has that type

#

there are in total 721 Pokémon's and I want to know which type is on average the strongest/best

#

but first I need to group them and idk how

lapis sequoia May 30, 2019, 2:41 PM

#

seems like you're trying to sort by strongest?

#

read it as a dataframe

#

import pandas as pd

#

it'll be a lot easier

olive willow May 30, 2019, 2:42 PM

#

I'm confused

#

a guy told me that I shouldn't use pandas

lapis sequoia May 30, 2019, 2:42 PM

#

why

olive willow May 30, 2019, 2:43 PM

#

idk he told that this is better

lapis sequoia May 30, 2019, 2:43 PM

#

uhh

olive willow May 30, 2019, 2:43 PM

#

yhh uuuhhh

lapis sequoia May 30, 2019, 2:43 PM

#

no it's not.. what's your end goal

olive willow May 30, 2019, 2:43 PM

#

I've 3 questions to answer

#

-What are the top 10 pokemons?

#

-Which Pokémon type is the best

lapis sequoia May 30, 2019, 2:44 PM

#

df = pd.read_csv(file_name_here)

olive willow May 30, 2019, 2:44 PM

#

-what makes the strong Pokémon's different from the weak ones, and has it to do with their type?

lapis sequoia May 30, 2019, 2:45 PM

#

top 10 pokemon.. just sort the dataframe by the second column and limit to 10

olive willow May 30, 2019, 2:45 PM

#

yh

lapis sequoia May 30, 2019, 2:45 PM

#

for which pokemon type is best.. do group by and aggregate the second column and find the type with the highest aggregate
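The sort and group-by suggestions could be sketched like this. The column names (`Name`, `Type 1`, `Total`) and the handful of sample rows are assumptions standing in for the real pokemon.csv.

```python
import pandas as pd

# Tiny stand-in for pokemon.csv; column names are assumptions.
# With the real file this would be: df = pd.read_csv("pokemon.csv")
df = pd.DataFrame({
    "Name": ["Arceus", "Mewtwo", "Slaking", "Pidgey", "Charmander", "Ho-Oh"],
    "Type 1": ["Normal", "Psychic", "Normal", "Normal", "Fire", "Fire"],
    "Total": [720, 680, 670, 251, 309, 680],
})

# Top 10: sort by the stat total and keep the first ten rows.
top = df.sort_values("Total", ascending=False).head(10)

# Best type on average: group by type, average the totals, sort.
by_type = df.groupby("Type 1")["Total"].mean().sort_values(ascending=False)

print(top[["Name", "Total"]])
print(by_type)
```

Because pandas parses `Total` as a number, the sort is numeric, unlike sorting the strings that `csv.reader` returns.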

#

for the third question.. im not sure..

olive willow May 30, 2019, 2:46 PM

#

yh I know what to do, but not how, so lemme dive into this

lapis sequoia May 30, 2019, 2:46 PM

#

gl

olive willow May 30, 2019, 2:46 PM

#

thanks!

void anvil May 30, 2019, 7:16 PM

#

Quick question about train sets. Do they always end up with 100% accuracy?

              precision    recall  f1-score   support

        -1.0       1.00      1.00      1.00     16593
         1.0       1.00      1.00      1.00     17145

   micro avg       1.00      1.00      1.00     33738
   macro avg       1.00      1.00      1.00     33738
weighted avg       1.00      1.00      1.00     33738

#

Assuming you don't choke prematurely

olive willow May 30, 2019, 8:33 PM

#

mostly not, highest you could get is 99.9 but thats a fully trained

desert oar May 30, 2019, 10:39 PM

#

depends on the model and the problem
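To illustrate the "depends on the model" point: a model that can memorise its training data scores 100% on it even when there is nothing to learn. A sketch on pure-noise labels, with a decision tree chosen here only as an example of such a model:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.choice([-1.0, 1.0], size=1000)   # labels are pure noise

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# An unrestricted tree keeps splitting until every training row is classified.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)

print(train_acc, test_acc)
```

A perfect training score on its own says little; the held-out score is the one to look at.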

karmic geyser May 31, 2019, 2:47 AM

#

I'm not sure where this really fits, but I want to apply a bandpass filter to a continuous audio signal in python. Are there any good tutorials on implementing a bandpass filter in software? I tried using scipy.signal stuff but it doesn't seem to be working correctly, it seems to just be lowering the volume of everything rather than the frequency band I want..

warm orbit May 31, 2019, 2:53 AM

#

try https://plot.ly/python/fft-filters/#bandpass-filter or https://stackoverflow.com/questions/12093594/how-to-implement-band-pass-butterworth-filter-with-scipy-signal-butter ?

FFT Filters

Learn how filter out the frequencies of a signal by using low-pass, high-pass and band-pass FFT filtering.

Stack Overflow

How to implement band-pass Butterworth filter with Scipy.signal.butter

UPDATE:

I found a Scipy Recipe based in this question! So, for anyone interested, go straight to: Contents » Signal processing » Butterworth Bandpass
I'm having a hard time to achieve what seemed

karmic geyser May 31, 2019, 2:54 AM

#

Yeah I used the 2 functions at the top. but it doesn't seem to be working.

#

of the first link you said before you removed comment

warm orbit May 31, 2019, 2:54 AM

#

yeah i realized that was the same thing you said you tried already

karmic geyser May 31, 2019, 2:57 AM

#

Yeah, I will try messing around with the stuff in those links again. pretty much I'm trying to get a stereo input split it into 5 channels, 2 being mid range, 2 being high frequency, and then 1 channel being a combined signal with a low pass filter for a subwoofer, then outputting it through the soundcard to speaker amplifiers.

#

I have done it all but the actual lowpass + highpass + bandpass part.
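For the filtering part, a minimal sketch with scipy.signal, assuming a 44.1 kHz sample rate and example cutoffs of 300 Hz and 3000 Hz (both made up here). The helper names are hypothetical. The filter state is carried between chunks so a continuous stream can be processed block by block.

```python
import numpy as np
from scipy import signal

FS = 44100  # sample rate in Hz (assumed)

def make_bandpass(low_hz, high_hz, fs=FS, order=4):
    """Butterworth band-pass as second-order sections."""
    return signal.butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")

def filter_stream(sos, chunks):
    """Filter consecutive chunks, carrying the filter state between them."""
    zi = np.zeros((sos.shape[0], 2))  # start from rest
    out = []
    for chunk in chunks:
        y, zi = signal.sosfilt(sos, chunk, zi=zi)
        out.append(y)
    return np.concatenate(out)

# One second of test signal: a 60 Hz tone plus a 1000 Hz tone.
t = np.arange(FS) / FS
low_tone = np.sin(2 * np.pi * 60 * t)
mid_tone = np.sin(2 * np.pi * 1000 * t)
x = low_tone + mid_tone

sos = make_bandpass(300, 3000)
y = filter_stream(sos, np.array_split(x, 10))
```

Two things that commonly cause "everything just gets quieter" are passing cutoffs in Hz without `fs` (they are then expected as fractions of the Nyquist frequency) and restarting the filter from zero state on every chunk. Whether either applies to the code in question is a guess.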

supple ferry May 31, 2019, 9:28 AM

#

I have a dataset like this. Index ranges from 0 to 406907.

   individual  choice  pred_full  pred_base
0     9710535       0   0.002726   0.001284
1     9710535       0   0.003087   0.001897
2     9710535       0   0.002884   0.001778
3     9710535       0   0.005785   0.004427
4     9710535       0   0.004033   0.002241
5     9710535       0   0.003827   0.002918
6     9710535       0   0.003576   0.002734
7     9710535       0   0.060620   0.042998
8     9710535       0   0.032249   0.022193
9     9710535       0   0.002046   0.001186

I want to group this dataset by individual, but also have a second level index which ranges from 0 to the size of that group. How this can be done in Pandas??

individual   number choice  pred_full  pred_base
9710535            0   0   0.002726   0.001284
                              1    0   0.003087   0.001897
                              2   0   0.002884   0.001778
                              3   0   0.005785   0.004427
                              4   0   0.004033   0.002241

desert oar May 31, 2019, 11:36 AM

#

@supple ferry you can use .groupby(level=...) to group using an index instead of a column

supple ferry May 31, 2019, 12:01 PM

#

@desert oar can I use both a normal column and an index

#

?

desert oar May 31, 2019, 12:04 PM

#

That's a good question

#

Probably not, but you can try it

#

If you need to turn an index into a column use .reset_index

#

http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html#pandas.DataFrame.reset_index

supple ferry May 31, 2019, 12:15 PM

#

if i reset the index, it will restore the grouped column

desert oar May 31, 2019, 1:17 PM

#

@supple ferry which columns do you want to group on?

#

im confused

#

i thought they were both index columns

void anvil May 31, 2019, 1:21 PM

#

@supple ferry df.sort_values([('Group1', 'Group2')], ascending=False)

#

index = pd.MultiIndex.from_tuples(tuples, names=['Group1', 'Group2'])

#

pd.MultiIndex.from_product(iterables, names=['Group1', 'Group2'])

#

pd.MultiIndex.from_frame(df)

#

depends on how you want to set it up

desert oar May 31, 2019, 1:44 PM

#

...i wouldnt do that

#

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'individual': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c'],
    'x': np.arange(12) + np.arange(12)/12,
    'y': [-1.0, -1.0, -1.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0]
}).set_index('individual')

# move 'y' into the index as well, then group on both index levels
df.set_index('y', append=True) \
    .groupby(level=['individual', 'y']) \
    .agg({'x': 'mean'})

#

@supple ferry ^ maybe that helps

supple ferry May 31, 2019, 2:10 PM

#

@desert oar , thank you for the suggestion!
It does not quite produce what I intended. What I want is another column besides y which just shows the index of each y value within its group, so in this case it would just repeat 0, 1 for every group:
this is the output I got with your help:

                         x
individual y
a          -1.0   1.083333
            4.0   3.250000
b           4.0   5.416667
            5.0   7.583333
c           5.0   9.750000
            6.0  11.916667

desert oar May 31, 2019, 2:11 PM

#

I guess I'm still not totally clear on what your data looks like then

#

You posted an example but it looks like there is more to your actual data than what you posted. Unless I misunderstand the example

supple ferry May 31, 2019, 2:11 PM

#

I will come up with an example now

desert oar May 31, 2019, 2:12 PM

#

Thanks, not trying to be obtuse or anything, it's just sometimes difficult to describe these things in words

supple ferry May 31, 2019, 2:17 PM

#

   individual  choice      pred
0           a       1  0.246645
1           a       1  0.530894
2           a       0  0.751739
3           a       1  0.902380
4           a       1  0.096860
5           a       1  0.153920
6           b       1  0.653829
7           b       0  0.349955
8           b       0  0.407649
9           b       0  0.402111
10          c       0  0.532963
11          c       1  0.263130
12          d       0  0.564971
13          d       0  0.226155
14          d       1  0.090390
15          d       1  0.682873
16          d       0  0.078723
17          d       1  0.963183
18          d       1  0.068704

#

@desert oar , this is basically how my data looks like

#

I want the output to have the column individual as its index, but also second index which is like the index I have now, but starting from zero for every group

#

group - > every unique individual

desert oar May 31, 2019, 2:27 PM

#

oh thats easier

#

def process_group(grp):
    grp = grp.reset_index(drop=True)
    grp.index.name = 'id_in_group'
    grp = grp.drop('individual', axis=1)  # drop 'individual' since this will be added as an index by the .groupby() operation
    return grp

mydata.groupby('individual').apply(process_group)

#

@supple ferry ok fixed up. try that ^

supple ferry May 31, 2019, 2:34 PM

#

@desert oar , this is exactly what i wanted!

#

thank you!

#

can you explain me a bit your approach too ?

#

because I may do some similar but not identical things with my df

#

because for me it was difficult to build an approach to the problem

desert oar May 31, 2019, 2:36 PM

#

yeah thats fair. pandas is a very big library

#

mydata.groupby() produces a sequence of dataframes, right?

#

(technically it produces a DataFrameGroupBy object but you dont need to care much about that)

#

so mydata.groupby(...).apply() accepts 1 argument, a callable. and that callable itself must accept 1 argument, which is the data frame corresponding to one group

#

so whatever you put in .apply() gets applied to each group one at a time

#

then they get concatenated back together

supple ferry May 31, 2019, 2:38 PM

#

split-apply-combine

desert oar May 31, 2019, 2:38 PM

#

precisely

#

so within each group, im doing this:

reset index so that it's sequential within the group starting from 0 (this is kind of a trick, ill elaborate after)
i set the index name just so it looks nice in printouts and can be easier to manage and keep track of
i delete the "individual" column from the grouped data, because i know that the groupby operation by default adds the grouping columns as an index

supple ferry May 31, 2019, 2:41 PM

#

apply works with groupby objects exactly like it works with rows, yes. yes, step 1 is crucial. i was thinking about setting a multi index, but failed to do it with a column and an array

desert oar May 31, 2019, 2:41 PM

#

the reset_index trick works like this:

the default index for a dataframe is sequential starting at 0
inside .groupby, the original dataframe indexes are preserved
so if you just delete the original index with .reset_index(drop=True), no index remains, so the default 0,1,... index is added; this is preserved in the result of the groupby.apply operation
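A different route to the same two-level index, shown only as an alternative sketch and not what was posted above, is `groupby(...).cumcount()`, which numbers the rows inside each group starting from 0. The sample data here is made up.

```python
import pandas as pd

mydata = pd.DataFrame({
    "individual": ["a", "a", "a", "b", "b", "c"],
    "choice": [1, 1, 0, 1, 0, 0],
    "pred": [0.25, 0.53, 0.75, 0.65, 0.35, 0.53],
})

result = (
    mydata
    # position of each row within its group: 0, 1, 2, ... per individual
    .assign(id_in_group=mydata.groupby("individual").cumcount())
    .set_index(["individual", "id_in_group"])
)

print(result)
```

This avoids `apply`, so it does not depend on how a given pandas version passes the grouping column to the applied function.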

#

cool

supple ferry May 31, 2019, 2:42 PM

#

wow

#

nice

#

this is what I call 100% complete help

#

not only shows how, but also shows why

desert oar May 31, 2019, 3:20 PM

#

i try 😃

#

pandas docs can be very dense. so i understand why these things are non-obvious

woeful ether May 31, 2019, 3:55 PM

#

test = pd.Series([0,1,"test",3])
test.append(pd.Series([4,5,6]))

Why does this not add to the Series?

supple ferry May 31, 2019, 4:05 PM

#

@woeful ether , it works for me

In [29]: test = pd.Series([0,1,"test",3])
    ...: test.append(pd.Series([4,5,6]))
Out[29]:
0       0
1       1
2    test
3       3
0       4
1       5
2       6
dtype: object

woeful ether May 31, 2019, 4:06 PM

#

wtf..

#

spent hours trying to add something to a series

#

and it doesnt work no matter what I do

#

how can it work for you and not me?

#

rage

supple ferry May 31, 2019, 4:19 PM

#

which version of pandas, python you have

#

?

woeful ether May 31, 2019, 4:20 PM

#

ok nvm im an idiot

#

but latest
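The thread does not say what the mistake was. One common cause, offered here as a guess, is that appending returns a new Series and leaves the original untouched, so the result has to be assigned. A sketch using `pd.concat`, since `Series.append` was removed in pandas 2.0:

```python
import pandas as pd

test = pd.Series([0, 1, "test", 3])
extra = pd.Series([4, 5, 6])

pd.concat([test, extra])   # result is thrown away, test is unchanged
print(len(test))           # still 4

# Keep the result; ignore_index renumbers 0..6 instead of repeating 0, 1, 2.
test = pd.concat([test, extra], ignore_index=True)
print(len(test))
```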

daring spindle May 31, 2019, 5:57 PM

#

Are the Kaggle micro courses a good idea for starting?

#

After that doing some tournaments.

#

To practice.

olive willow May 31, 2019, 6:06 PM

#

sure every kind of practice is good

desert oar May 31, 2019, 6:58 PM

#

what's a micro course?

lapis sequoia May 31, 2019, 7:02 PM

#

You guys are mostly discussing ML algorithms right, and how to implement them....?

desert oar May 31, 2019, 7:02 PM

#

not necessarily, but that's on-topic here

lapis sequoia May 31, 2019, 7:04 PM

#

I am just starting with machine learning, and I cannot understand shit what most of the doubts or messages are here..
People talking about topics I never heard of...

#

How long will it take to be familiar with all these topics?

dense rose May 31, 2019, 7:12 PM

#

You'll never be familiar with all of them tbh

desert oar May 31, 2019, 8:19 PM

#

it takes a long time

#

better to focus on one or two things to start

#

and expand from there

#

a lot of learning new topics involves learning the math

#

so if you are good with math and know the basics, its easier to learn new things

olive willow May 31, 2019, 9:03 PM

#

just a question I'm 14 and am going really fast into data science. but how the hell am I supposed to learn calc and gradient descent for example. For online courses? or ask my teacher

#

because I will have to learn it if I continue on my current pace of learning in like 1 to 2 years

#

to learn ML and neural networks

#

I'm extremely good in math but the vocab is the main problem and that I've to learn it on my own

#

any tips?

void anvil May 31, 2019, 9:12 PM

#

@olive willow you're better off just treating it as a black box for now. Learn vocabulary related to the models and their use and ignore the math for a bit.

olive willow May 31, 2019, 9:12 PM

#

sure yh

#

btw can you give me an example of a vector in code

#

?

void anvil May 31, 2019, 9:12 PM

#

[0,1,2,3]

olive willow May 31, 2019, 9:13 PM

#

just a list

void anvil May 31, 2019, 9:13 PM

#

a vector is a quantity with a magnitude and direction

olive willow May 31, 2019, 9:14 PM

#

how would you write a 2d vector, just two list inside one?
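In code, a 2D vector is usually one flat list with two components; two lists nested inside one is a matrix. A small NumPy sketch with made-up numbers:

```python
import numpy as np

v = np.array([3.0, 4.0])      # a 2D vector: two components in one flat array
length = np.linalg.norm(v)    # magnitude
direction = v / length        # unit vector pointing the same way

m = np.array([[1.0, 2.0],
              [3.0, 4.0]])    # two lists inside one: a 2x2 matrix

print(v.shape, m.shape)
```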

lean ledge May 31, 2019, 9:27 PM

#

@olive willow you can learn all the fundamental maths through Khan academy and MIT OCW

olive willow May 31, 2019, 9:27 PM

#

thanks dude!

#

will check it out tomorrow !

desert oar May 31, 2019, 9:47 PM

#

"take your time" @olive willow 😃

olive willow May 31, 2019, 10:12 PM

#

life is short bro

#

😃

mossy dragon Jun 1, 2019, 1:26 AM

#

Do exercises

#

tons and tons of exercises

stoic beacon Jun 1, 2019, 1:43 AM

#

I'm stuck. I'm starting to get into ML and I have a work related dataset that I'm trying to do a basic linear regression on. My plot won't work because x and y aren't the same size because one is 2d and one is 1d. This makes sense. But then I'm not sure how best to plot my data. I'm trying to predict the number of cases by the day of the week and date (or days since the start date - I had to use this to make the date into some numeric value). My code is here (sorry for no actual code). If you can provide me with any information that will help like what graph I should use or any other helpful info I'd really appreciate it. My brain is dead from trying to grasp all of this

https://GitHub.com/tyr4el/caseslinearregressionml

GitHub

Tyr4el/CasesLinearRegressionML

Contribute to Tyr4el/CasesLinearRegressionML development by creating an account on GitHub.

#

If you do help, can you just @ me please? I'm away from my PC and won't be able to readily read what you post and it may be hours later or tmrw

mossy dragon Jun 1, 2019, 4:49 AM

#

@stoic beacon So you are saying your regression won't work because you have two explanatory variables? That shouldn't be a problem if you are doing multiple linear regression.

#

I'm also not sure why you're scaling the data; it doesn't seem necessary to me in this case. (Although I'm not familiar with regression in python so someone else please chime in if I'm wrong!)

olive willow Jun 1, 2019, 7:59 AM

#

guys do you need Quadratic algebra for data science?

mossy dragon Jun 1, 2019, 8:36 AM

#

wat

olive willow Jun 1, 2019, 8:36 AM

#

this, one sec

mossy dragon Jun 1, 2019, 8:37 AM

#

what job do you want though

#

like what do you want to do specifically?

olive willow Jun 1, 2019, 8:37 AM

#

data scientist

#

📎 graph10.png

#

this

mossy dragon Jun 1, 2019, 8:38 AM

#

data scientist is a really big umbrella term

#

you need to find out specifically what you want to do so you can figure out what you need to learn

foggy sky Jun 1, 2019, 8:38 AM

#

You guys have worked with Kalman Filters?

olive willow Jun 1, 2019, 8:39 AM

#

📎 17IMev5xslc9FLxr9hHhpFw.png

#

mostly analysis and ML/deep

#

so not the last two

#

so mostly data science analytics

#

but what's the difference between a data analyst and data scientist

#

they're not the same

foggy sky Jun 1, 2019, 8:42 AM

#

So, you have worked with kalman filters?😂

olive willow Jun 1, 2019, 8:42 AM

#

me no dude sry

foggy sky Jun 1, 2019, 8:43 AM

#

Well, a kalman filter is a statistical method that works well when sensors fail... it reduces noise, basically

olive willow Jun 1, 2019, 8:44 AM

#

so it compares the new sensor log and if it's way different than the previous ones, the new log gets adjusted ?

#

something like that

foggy sky Jun 1, 2019, 8:46 AM

#

Something like that... but it predicts online and offline... so it will work even if the sensors aren't working for a short period of time

olive willow Jun 1, 2019, 8:46 AM

#

oohhh that's good

#

it must be very useful

foggy sky Jun 1, 2019, 8:47 AM

#

Yeah, but I'm trying to predict bus arrival time with it using a library on python called pykalman...

olive willow Jun 1, 2019, 8:48 AM

#

so you have a training set right?

#

of previous times

foggy sky Jun 1, 2019, 8:48 AM

#

Yeah! But kalman doesn't work like ML methods

olive willow Jun 1, 2019, 8:49 AM

#

ooohhh so you have to do the entire algorithm by yourself?

#

or what?

foggy sky Jun 1, 2019, 8:50 AM

#

No, I can import pykalman... it has the functions that I need

olive willow Jun 1, 2019, 8:50 AM

#

oohh sure, ahhaha

foggy sky Jun 1, 2019, 8:50 AM

#

But I really don't understand how to use it 😂

olive willow Jun 1, 2019, 8:50 AM

#

isn't it better to create a ML program to predict the upcoming time?

mossy dragon Jun 1, 2019, 8:50 AM

#

what are you trying to use it for

olive willow Jun 1, 2019, 8:51 AM

#

he wants to predict a bus arrival time using pykalman

foggy sky Jun 1, 2019, 8:52 AM

#

I'm using big data

olive willow Jun 1, 2019, 8:52 AM

#

which format? csv

#

?

foggy sky Jun 1, 2019, 8:52 AM

#

So... have some GB of data XD

olive willow Jun 1, 2019, 8:52 AM

#

yh hahahah

foggy sky Jun 1, 2019, 8:52 AM

#

Yeah, csv

olive willow Jun 1, 2019, 8:52 AM

#

I bet from Kaggle

#

it would be easier using ML I think but aren't there any yt tutorials about pykalman?

mossy dragon Jun 1, 2019, 8:59 AM

#

figure out and put it in your blog

#

rake in the $$$

foggy sky Jun 1, 2019, 9:04 AM

#

XD

lean ledge Jun 1, 2019, 9:22 AM

#

@foggy sky I've worked with Kalman filters

#

What about em?

#

Kalman filters are about state estimation for processes with linear dynamics with Gaussian noise. You'll have to tell me what the state in your bus system would be and how you plan on figuring out system dynamics

foggy sky Jun 1, 2019, 9:28 AM

#

I already have the data ready

#

But I don't know how to use the function in pykalman

#

@lean ledge

#

What do you use to create the kalman filter?

#

I really don't know too much about it... that's why I'm asking for some help😂

mossy dragon Jun 1, 2019, 9:41 AM

#

hey raggy

#

weren't you also in another server with me?

lean ledge Jun 1, 2019, 9:43 AM

#

@mossy dragon Yes. I am on a lot of servers

#

@foggy sky to create the Kalman filter, you need a model of the system dynamics

mossy dragon Jun 1, 2019, 9:44 AM

#

i thought you were in the data science or the statistics server but i guess not

#

weird im 100% sure i met you somewhere else

lean ledge Jun 1, 2019, 9:45 AM

#

I left the data science server

#

I didn't like some of the people there

mossy dragon Jun 1, 2019, 9:45 AM

#

O

foggy sky Jun 1, 2019, 9:46 AM

#

@lean ledge but, what library or in what language did you create it?

lean ledge Jun 1, 2019, 9:47 AM

#

It doesn't really matter, all of them will do the same thing mathematically

foggy sky Jun 1, 2019, 9:48 AM

#

😂 right now I'm looking for the easiest one XD

lean ledge Jun 1, 2019, 9:48 AM

#

They'll all basically require the same things

#

4 or so matrices

#

With some options here and there

foggy sky Jun 1, 2019, 9:49 AM

#

I got the measurements matrix

#

But how do I get the other ones?

#

I have to create them on my own?? Or what?

lean ledge Jun 1, 2019, 9:54 AM

#

@foggy sky that's why I keep saying you need a model of the system. You need to either calculate (not possible here) or learn the parameters for the state transition matrix

#

Given you don't have any control over the bus, you can leave the control input matrix a zero matrix
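To make the "4 or so matrices" concrete, here is a plain NumPy sketch of a linear Kalman filter with no control input. It does not use pykalman. The constant-velocity model (state = position and velocity, position measured) and all the numbers are toy assumptions, not a model of the bus data.

```python
import numpy as np

def kalman_filter(measurements, F, H, Q, R, x0, P0):
    """Run a linear Kalman filter (no control input); return the filtered states."""
    x, P = x0.astype(float), P0.astype(float)
    states = []
    for z in measurements:
        # Predict: push the state and its uncertainty through the dynamics.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update: blend the prediction with the new measurement.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.atleast_1d(z) - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        states.append(x.copy())
    return np.array(states)

# Assumed toy model: state = [position, velocity], only position is measured.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (constant velocity)
H = np.array([[1.0, 0.0]])               # measurement matrix
Q = 1e-4 * np.eye(2)                     # process noise covariance
R = np.array([[4.0]])                    # measurement noise covariance

rng = np.random.default_rng(0)
truth = np.arange(50, dtype=float)
noisy = truth + rng.normal(0.0, 2.0, size=50)
states = kalman_filter(noisy, F, H, Q, R, x0=np.zeros(2), P0=10.0 * np.eye(2))
```

The hard part mentioned above is choosing F, Q and R so they describe the real system; the filter loop itself is the same everywhere.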

foggy sky Jun 1, 2019, 9:56 AM

#

Oh...

#

Well, that's useful... XD

lean ledge Jun 1, 2019, 10:00 AM

#

Might just be able to sort of guess an approximate state transition matrix based on data

foggy sky Jun 1, 2019, 10:02 AM

#

Where can I learn more about it?

lean ledge Jun 1, 2019, 10:02 AM

#

I have a feeling this is an XY problem. Why are you trying to use a Kalman filter?

#

Kalman filters (along with things like particle filters) are for dynamical systems

#

And it's for state estimation under noise

#

I don't think that's quite what you're using it for

foggy sky Jun 1, 2019, 10:05 AM

#

This has been done before... using prediction models with kalman filters will give you better results...

lean ledge Jun 1, 2019, 10:05 AM

#

You can't use anything with anything

#

As I said

#

Kalman filters are for linear dynamical systems undergoing Gaussian noise

#

It gives better results on things that match that description

#

You can't expect to jam in any model anywhere. You may be able to apply another Bayesian filter model on your problem but not Kalman filtering unless it fits that description

foggy sky Jun 1, 2019, 10:09 AM

#

Ok... Thanks bro...

olive willow Jun 1, 2019, 10:58 AM

#

guys for what do you need linear algebra in programming

#

I mean the vectors

lapis sequoia Jun 1, 2019, 11:28 AM

#

well young padawan.. for that we need to go back and understand what vectors are..

#

https://towardsdatascience.com/a-practical-look-at-vectors-and-your-data-95bde21b37d1

Towards Data Science

A Practical Look at Vectors and Your Data

I remember a feeling of utter confusion when I first learned vector spaces in my first course of Linear Algebra. What’s a space? And what…

olive willow Jun 1, 2019, 11:33 AM

#

I know what it is my obiwan... it's a place far far far from home with only one set of coordinates corresponding to it

#

you can add them together if you would want to do it

#

@lapis sequoia

lapis sequoia Jun 1, 2019, 11:34 AM

#

gimme a sec.. im on a call

#

brb

olive willow Jun 1, 2019, 11:34 AM

#

sure np

#

it has a purpose in life, a direction and a length

#

but how do we represent a vector in programming, just a list?? what is it good for?

#

what applications in programming and data science does it have

#

All the elements are associative, commutative, and scalars are distributive with respect to element addition

#

what does this mean?

#

and:

#

There’s an element in the set such that adding it to any other element doesn’t change its value

#

can you gimme an example of this?

#

and of this There’s some number (called a scalar) such that multiplying it by any other element doesn’t change the element’s value
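The quoted properties can be checked on small NumPy arrays. A sketch with arbitrary example vectors: the "element that changes nothing when added" is the zero vector, and the scalar that changes nothing when multiplied is 1.

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])
w = np.array([0.5, 4.0])
zero = np.zeros(2)

commutative = np.allclose(u + v, v + u)                        # order of addition
associative = np.allclose((u + v) + w, u + (v + w))            # grouping of addition
distributive = np.allclose(2.0 * (u + v), 2.0 * u + 2.0 * v)   # scalar over a sum
additive_identity = np.allclose(u + zero, u)                   # zero vector changes nothing
scalar_identity = np.allclose(1.0 * u, u)                      # multiplying by 1 changes nothing

print(commutative, associative, distributive, additive_identity, scalar_identity)
```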

lapis sequoia Jun 1, 2019, 11:52 AM

#

https://image01.ipracticemath.com/content/imageslm/algebra/summary-cummutative-associative-law.png

olive willow Jun 1, 2019, 11:52 AM

#

what's that?

#

that algebra ?? right

#

or it looks like it

lapis sequoia Jun 1, 2019, 11:55 AM

#

these illustrations will help you understand these properties better

#

http://chortle.ccsu.edu/vectorlessons/vectorIndex.html#03

Vector Math Tutorial for 3D Computer Graphics

Tutorial on vector algebra for 3D computer graphics. Highly interactive.

stoic beacon Jun 1, 2019, 11:56 AM

#

@mossy dragon I wasn't doing multiple linear...lol. Where can I find the docs on multiple linear in sklearn? And yeah I won't scale it. I wasn't sure either

#

But I did think I needed to because one of my variables does get pretty large in comparison to the others

mossy dragon Jun 1, 2019, 11:57 AM

#

if you're using more than one explanatory variable you want to use multiple linear regression

stoic beacon Jun 1, 2019, 11:57 AM

#

In the thousands while the others are 0-4 and in the hundreds

mossy dragon Jun 1, 2019, 11:57 AM

#

you should look at the distribution first I think

#

if its super skewed it might be a problem

stoic beacon Jun 1, 2019, 11:58 AM

#

It's not

#

I plotted it when my days since start was a datetime

olive willow Jun 1, 2019, 11:59 AM

#

thanks @lapis sequoia

stoic beacon Jun 1, 2019, 11:59 AM

#

Against the number of cases

#

Looked like this

📎 output_1.png

#

Can sklearn do multiple linear regression?

#

I can't seem to find anything on their site

#

Ah statsmodels seems to do it
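For what it's worth, scikit-learn's `LinearRegression` also fits multiple explanatory variables: X just needs one column per variable. A sketch on synthetic data; the two features mirror the ones described (days since start, day of week), but the numbers are made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data; the real columns live in the linked repo.
days_since_start = np.arange(100)
day_of_week = days_since_start % 5                      # 0..4 for Monday..Friday
X = np.column_stack([days_since_start, day_of_week])    # shape (100, 2)
y = 10 + 0.5 * days_since_start + 2.0 * day_of_week     # shape (100,)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

print(model.coef_, model.intercept_)
```

For plotting, one option is to plot `y` and `y_pred` against `days_since_start`, since both are 1D. Treating the weekday as a plain number is a simplification; one-hot encoding it is the more usual choice.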

mossy dragon Jun 1, 2019, 12:02 PM

#

yea no i dont think you're gonna get good results if thats what your data looks like

stoic beacon Jun 1, 2019, 12:04 PM

#

I was advised to do linear first to just get a baseline

#

Then go from there

#

But if you suggest something different then that's fine

lapis sequoia Jun 1, 2019, 12:06 PM

#

looks like time series..

#

what are you trying to do

stoic beacon Jun 1, 2019, 12:07 PM

#

Predict the number of cases per day (Monday to Friday)

#

I was told time series would work but I had a hard time installing prophet and idk if I'd be able to install it at work

lapis sequoia Jun 1, 2019, 12:10 PM

#

number of what cases?

#

and how long does the data go back

stoic beacon Jun 1, 2019, 12:10 PM

#

Just number of cases

#

Cases = tickets

#

I'm in tech support

#

Uhhh 2012

#

Though you can see that around that time there weren't many cases per day

#

0-4 maybe

lapis sequoia Jun 1, 2019, 12:12 PM

#

well you can't use that then

stoic beacon Jun 1, 2019, 12:12 PM

#

Why

lapis sequoia Jun 1, 2019, 12:12 PM

#

it's not relevant.. there's no related pattern

stoic beacon Jun 1, 2019, 12:12 PM

#

Gatcha. I can drop those years

#

No biggie

lapis sequoia Jun 1, 2019, 12:14 PM

#

you can use https://colab.research.google.com

Google Colaboratory

#

for prophet

#

it's free.. and prophet comes installed

#

the data isn't that big, so this should do.. with the free runtime

stoic beacon Jun 1, 2019, 12:15 PM

#

Gatcha

#

Is time series the only thing I could use in this case?

#

The only model that would work

#

I wonder what other data I could get that I can use ML on lol

#

From work

#

I have access to a lot of reports lol

lapis sequoia Jun 1, 2019, 12:17 PM

#

depends what type of data it is..

#

this is historical data.. so time series for projections.. yes

stoic beacon Jun 1, 2019, 12:18 PM

#

Alrighty. I'll give that a shot for this

#

Is time series considered ML or statistical analysis?

lapis sequoia Jun 1, 2019, 12:21 PM

#

ML is statistics..

stoic beacon Jun 1, 2019, 12:21 PM

#

This is true

#

Lol

lapis sequoia Jun 1, 2019, 12:21 PM

#

time series is more stats related.. but you can just say time series forecasting

stoic beacon Jun 1, 2019, 12:22 PM

#

Fair enough. I'll try using that colab thing then

#

That should work

#

And I'll think of other things I could use ML for

olive willow Jun 1, 2019, 12:26 PM

#

guys so if we're talking about a 'real coordinate space' it has to be inside a tuple in python

#

it is a 2D array, the R^2

stoic beacon Jun 1, 2019, 12:27 PM

#

Ignore me

olive willow Jun 1, 2019, 12:27 PM

#

sure haha

stoic beacon Jun 1, 2019, 12:27 PM

#

A tuple can contain as many things as you want

olive willow Jun 1, 2019, 12:28 PM

#

I know dude

#

I'm quite familiar with my datatypes

stoic beacon Jun 1, 2019, 12:28 PM

#

I wasn't done with my thought lol. I was driving

olive willow Jun 1, 2019, 12:28 PM

#

sure

mossy dragon Jun 1, 2019, 12:28 PM

#

I think you should learn more stats

#

before u try modeling stuff

lapis sequoia Jun 1, 2019, 12:28 PM

#

I think you shouldn't be typing while driving..

stoic beacon Jun 1, 2019, 12:28 PM

#

Prob

olive willow Jun 1, 2019, 12:29 PM

#

guys so if we're talking about a '2 dimensional real coordinate space' it has to be inside a tuple in python like this ([3,4], [4,3])

stoic beacon Jun 1, 2019, 12:33 PM

#

Anyway @lapis sequoia @mossy dragon thanks for the help. Im probably a little rusty on stats. Only had a basic class in college

#

Don't have much free time nowadays but I'll do what I can

lapis sequoia Jun 1, 2019, 12:33 PM

#

it doesn't take long to brush up.. I think there's free courses on udacity

#

with illustrations

#

https://www.freecodecamp.org/news/if-you-want-to-learn-data-science-take-a-few-of-these-statistics-classes-9bbabab098b9/

freeCodeCamp.org News

If you want to learn Data Science, take a few of these statistics ...

by David Venturi

If you want to learn Data Science, take a few of these statistics classes
Image credit [http://www.123rf.com/profile_pixelsaway]A year ago, I was a
numbers geek with no coding background. After trying an online programming
course, I was so inspired that I en...

stoic beacon Jun 1, 2019, 12:34 PM

#

Things like mean, STD dev, etc are still mostly fresh

lapis sequoia Jun 1, 2019, 12:36 PM

#

yep that's where you start..

#

then there's statistical tests and stuff..

#

but there's a diagram ... wait

#

Stats tests

📎 59707162_10216517391280144_807067472095084544_n.jpg

stoic beacon Jun 1, 2019, 12:38 PM

#

Yeah I'm rusty on the basic ones

olive willow Jun 1, 2019, 1:01 PM

#

guys is this a 1d matrix?

#

           [1,1,1],
           [1,1,1]
        ])

proven crater Jun 1, 2019, 1:04 PM

#

I think the [1, 1, 1] on its own is 1D

earnest prawn Jun 1, 2019, 1:04 PM

#

a 1d matrix would be a vector and this is clearly 2D as it's a list of lists

#

so it is just a normal matrix

olive willow Jun 1, 2019, 1:05 PM

#

ooohh yh sure but isn't a vector also 2d??

#

like [4,2]

#

it has a place on the x and the y axis

#

or am I seeing it wrong

earnest prawn Jun 1, 2019, 1:05 PM

#

thats just a vector which happens to contain two values

#

it is still one dimensional

olive willow Jun 1, 2019, 1:06 PM

#

but this would be two: `[[4,2],[7,5]]`

earnest prawn Jun 1, 2019, 1:06 PM

#

the dimensions, when talking about n-dimensional matrices, refer to how deeply the lists are nested, not how many values each list contains

#

so a list in a list in a list is 3D

#

a list in a list is 2D

#

and a list is 1D (aka a vector)
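The "list in a list" rule above can be checked with a few lines of plain Python. This is only a sketch: `nesting_depth` is a made-up helper name, and it assumes well-formed nested lists. NumPy exposes the same count as the `ndim` attribute of an array.

```python
def nesting_depth(value):
    """Count how many levels of lists are nested inside each other."""
    if not isinstance(value, list):
        return 0  # a plain number has no list around it
    if not value:
        return 1  # an empty list still counts as one level
    # One level for this list, plus the deepest level found inside it.
    return 1 + max(nesting_depth(item) for item in value)


vector = [4, 2]                               # one list: 1D, even with two values
matrix = [[4, 2], [7, 5]]                     # list of lists: 2D
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # list in a list in a list: 3D

print(nesting_depth(vector), nesting_depth(matrix), nesting_depth(cube))
```

So `[4, 2]` is a 1D container whose two values can describe a point in a 2D space; those are two different uses of the word "dimension".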

olive willow Jun 1, 2019, 1:07 PM

#

oohh sure thanks now I understand it

#

I was looking at it from a math perspective

#

IRL

earnest prawn Jun 1, 2019, 1:07 PM

#

from a math perspective a matrix is still 2D and a vector is 1D

#

but a vector can indicate a point or something else in a 3D space if it has three values

olive willow Jun 1, 2019, 1:08 PM

#

but if it has two values, the vector is still 1d

earnest prawn Jun 1, 2019, 1:08 PM

#

yes

olive willow Jun 1, 2019, 1:08 PM

#

but the values in the vector are 2d or 3d

earnest prawn Jun 1, 2019, 1:09 PM

#

that could be a way to express it but dont nail me down on how exactly that is defined

olive willow Jun 1, 2019, 1:09 PM

#

sure hahaha

#

what are vectors and matrices used for in programming and ML

#

?

earnest prawn Jun 1, 2019, 1:10 PM

#

well vectors can represent lots of things like for example velocities or forces in games or physics simulations etc

olive willow Jun 1, 2019, 1:10 PM

#

they're just datatypes?

earnest prawn Jun 1, 2019, 1:10 PM

#

and matrices...well they have a million use cases in almost every area

olive willow Jun 1, 2019, 1:10 PM

#

so they're just datatypes used to store other datatypes

#

like lists

#

and you can make a matrix multidimensional

earnest prawn Jun 1, 2019, 1:11 PM

#

no a matrix is by definition a list of lists

#

matrices are per definition 2D

#

not less not more

#

yes but thats not a matrix anymore

olive willow Jun 1, 2019, 1:12 PM

#

what's is it called then?

earnest prawn Jun 1, 2019, 1:12 PM

#

thatd be a tensor

olive willow Jun 1, 2019, 1:12 PM

#

and a tensor is a 3 or more dimensional datatype

earnest prawn Jun 1, 2019, 1:13 PM

#

a tensor can be 1-n d

olive willow Jun 1, 2019, 1:13 PM

#

oohh

#

but what's the difference between a matrix and a 2d tensor

earnest prawn Jun 1, 2019, 1:14 PM

#

i am not really into tensors but Id argue its just a special case

olive willow Jun 1, 2019, 1:14 PM

#

sure

#

but if I would make a 3d graph I would use a tensor to store the x, y and z axis values

#

and for a 2d one a matrix that stores the x and y axis values

earnest prawn Jun 1, 2019, 1:15 PM

#

you can just use a matrix for 3D graphs

#

matrix[x][y] = z

#

well thatd be a pretty bad way as you could only have natural numbers for x and y

#

so if youd want it more precise yes youd have to use something with more dimensions

olive willow Jun 1, 2019, 1:16 PM

#

ok thanks @earnest prawn for the help!

#

and a pandas dataframe, has no dimensions right?

#

and a vector can only have real numbers not even vars

#

?

earnest prawn Jun 1, 2019, 1:18 PM

#

so for the pandas part i dont know really I never did pandas

and I definitely can have variables inside my vectors

olive willow Jun 1, 2019, 1:19 PM

#

oohh sure, but the vars have to have a real number assigned to them?

#

or not

earnest prawn Jun 1, 2019, 1:19 PM

#

if you are trying to exclude the possibility that there is a complex number inside a vector I cant answer that question with any certainty because Ive never looked into complex numbers

#

but my first intuition would be that a complex number inside a vector would be fine

olive willow Jun 1, 2019, 1:20 PM

#

dude I'm 14, I'm just trying to understand what you can use a vector for and also a matrix, tensor. and what can they store

earnest prawn Jun 1, 2019, 1:21 PM

#

ohhh

olive willow Jun 1, 2019, 1:21 PM

#

and how they are used inside ML

#

from the cs perspective

earnest prawn Jun 1, 2019, 1:21 PM

#

well you can use vectors matrices and tensors to store any numeric data you want

olive willow Jun 1, 2019, 1:21 PM

#

and numeric data is data which has real numbers?

#

and numpy is a lib for that

earnest prawn Jun 1, 2019, 1:24 PM

#

well it can also have complex numbers I think

#

and yes numpy is a lib for n dimensional arrays and their manipulation

olive willow Jun 1, 2019, 1:26 PM

#

real numbers are 3 4 6

#

and complex are ? like vars

earnest prawn Jun 1, 2019, 1:26 PM

#

complex numbers are something you dont have to understand

olive willow Jun 1, 2019, 1:27 PM

#

but can I have an example, just to know that it's a complex number

#

to know that it's one when I see one

earnest prawn Jun 1, 2019, 1:27 PM

#

well for example

10 + 2 * sqrt(-1)

#

or commonly expressed as 10 + 2 * i
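For reference, Python has complex numbers built in, so the example above can be tried directly. A small sketch: note that Python spells the imaginary unit `j` rather than `i`, and that `math.sqrt(-1)` raises an error while `cmath.sqrt(-1)` does not.

```python
import cmath

z = 10 + 2j             # the number written above as 10 + 2 * i
root = cmath.sqrt(-1)   # the complex square root of -1

print(z.real, z.imag)   # the real part and the imaginary part
print(root)
print(root * root)      # squaring the root gets back to -1
```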

olive willow Jun 1, 2019, 1:28 PM

#

so kinda algebra

#

but without the =

earnest prawn Jun 1, 2019, 1:28 PM

#

the relevant part about complex numbers is that we have sqrt(-1)

olive willow Jun 1, 2019, 1:29 PM

#

exactly that or also a different one

earnest prawn Jun 1, 2019, 1:29 PM

#

exactly that

#

you cant calculate the sqrt(-1) can you?

olive willow Jun 1, 2019, 1:29 PM

#

yh of course

#

it's the num times the num

earnest prawn Jun 1, 2019, 1:30 PM

#

anyways I doubt you will see any complex numbers for at least the next 4 years of your life

and no (-1)^2 would be 1, the thing about complex numbers is that they break the rule that you cant have something negative under the sqrt

olive willow Jun 1, 2019, 1:30 PM

#

yh I know that

#

I already know that

#

that's why it has to be in ()

earnest prawn Jun 1, 2019, 1:31 PM

#

yes and the special thing about complex numbers is that they allow it

olive willow Jun 1, 2019, 1:31 PM

#

yh

#

so for data science I need linear algebra, calc and stats

earnest prawn Jun 1, 2019, 1:32 PM

#

i dont exactly get the transition here but yes

olive willow Jun 1, 2019, 1:32 PM

#

the main math subjects

#

I mean

earnest prawn Jun 1, 2019, 1:33 PM

#

yes

olive willow Jun 1, 2019, 1:33 PM

#

calculus, statistics and linear algebra

earnest prawn Jun 1, 2019, 1:36 PM

#

if youre waiting for a second yes

#

yes

olive willow Jun 1, 2019, 1:36 PM

#

sure hahahaha

#

one more question:

#

guys so if we're talking about a '2 dimensional real coordinate space' it has to be inside a tuple in python like this ([3,4], [4,3])

#

right?

#

so the R^2

earnest prawn Jun 1, 2019, 1:38 PM

#

what has to be in a tuple

#

what are those supposed to represent

olive willow Jun 1, 2019, 1:39 PM

#

vectors

earnest prawn Jun 1, 2019, 1:39 PM

#

thats supposed to be one vector?

olive willow Jun 1, 2019, 1:39 PM

#

like every vector you can make with that set of numbers

simple frigate Jun 1, 2019, 1:40 PM

#

not exactly my definition of vector but okay

olive willow Jun 1, 2019, 1:40 PM

#

hahahaha

#

yh those are two vectors

#

it's supposed to be a 2 dimensional real coordinate space

#

in python

#

like an example of one

#

so R^2

reef bone Jun 1, 2019, 1:44 PM

#

You can check out Essence of linear algebra on youtube, it's a fairly good introduction

#

https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab

YouTube

Essence of linear algebra - YouTube

A geometric understanding of matrices, determinants, eigen-stuffs and more.

olive willow Jun 1, 2019, 1:44 PM

#

yh

reef bone Jun 1, 2019, 1:45 PM

#

[1, 2, 3] would be a 3D vector

olive willow Jun 1, 2019, 1:45 PM

#

I know 3b1b

reef bone Jun 1, 2019, 1:45 PM

#

It gives the vector's magnitudes along 3 axes

olive willow Jun 1, 2019, 1:45 PM

#

yh

#

I know that

#

but in code, how would you represent a 2 dimensional real coordinate space
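One way to look at the question: R^2 is the set of all ordered pairs of real numbers, so the whole space cannot be written out as a value. What code can hold is individual members of it. A minimal sketch, assuming floats stand in for real numbers; the names `Vec2`, `add` and `scale` are made up for illustration.

```python
from typing import Tuple

# Hypothetical alias: any pair of floats is treated as one member of R^2.
Vec2 = Tuple[float, float]


def add(u: Vec2, v: Vec2) -> Vec2:
    """Add two vectors component by component."""
    return (u[0] + v[0], u[1] + v[1])


def scale(k: float, v: Vec2) -> Vec2:
    """Multiply every component of v by the scalar k."""
    return (k * v[0], k * v[1])


a: Vec2 = (3.0, 4.0)
b: Vec2 = (4.0, 3.0)
print(add(a, b), scale(2.0, a))
```

Read this way, `([3,4], [4,3])` is a tuple holding two members of R^2, not the space itself.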

earnest prawn Jun 1, 2019, 1:45 PM

#

I'm still trying to wrap my head around what you want to express with a tuple of 2 vectors

olive willow Jun 1, 2019, 1:46 PM

#

1 sec lemme show you where I found it

#

https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/vectors/v/real-coordinate-spaces?modal=1

Khan Academy

Learn for free about math, art, computer programming, economics, physics, chemistry, biology, medicine, finance, history, and more. Khan Academy is a nonprofit with the mission of providing a free, world-class education for anyone, anywhere.

lean ledge Jun 1, 2019, 1:48 PM

#

@olive willow

but what's the difference between a matrix and a 2d tensor

For computer scientists who have 0 respect for mathematical definitions and grace: a matrix is a 2d tensor

For actual people who understand maths: a matrix is a representation of a rank 2 tensor. A multidimensional array is to a tensor what a matrix is to a linear transformation

#

A tensor is more fundamentally a linear transformation that transforms in a particular way with a change of coordinates

olive willow Jun 1, 2019, 1:49 PM

#

one sec searching linear transformation up

stoic beacon Jun 1, 2019, 1:57 PM

#

Oh Raggy

olive willow Jun 1, 2019, 2:07 PM

#

what does linear combination mean???

#

the `A1V1 + A2V2 + A3V3 + ... + AnVn`

#

A = scalar, V = vector

earnest prawn Jun 1, 2019, 2:17 PM

#

a linear combination is a sum of vectors which are each multiplied with a scalar, you can do some interesting stuff with that in geometry for example

olive willow Jun 1, 2019, 2:23 PM

#

so if I've two vectors V[4,5] and A[2,7]

#

the scalars are 4 for x and 5 for y

#

and 2 for x and 7 for y

earnest prawn Jun 1, 2019, 2:24 PM

#

no

#

thats not what they mean

olive willow Jun 1, 2019, 2:24 PM

#

the basis vectors

earnest prawn Jun 1, 2019, 2:24 PM

#

5 * [1,2] + 3 * [2,10]

scalars are 5 and 3 vectors should be obvious

olive willow Jun 1, 2019, 2:24 PM

#

yh I know that

#

https://www.youtube.com/watch?v=k7RM-ot2NWY

YouTube

3Blue1Brown

Linear combinations, span, and basis vectors | Essence of linear a...

Home page: https://www.3blue1brown.com/ The fundamental vector concepts of span, linear combinations, linear dependence, and bases all center on one surprisi...

▶ Play video

#

sec 40

#

I mean with that kind of thinking

earnest prawn Jun 1, 2019, 2:26 PM

#

yeah he wants you to think about it as scalars

#

which makes sense

#

but they are not scalars

olive willow Jun 1, 2019, 2:26 PM

#

oooh so it isn't like a function you really use?

earnest prawn Jun 1, 2019, 2:27 PM

#

he is just trying to explain to you how a vector works and yes you can mathematically express a vector like he does buuuuut if youd name the elements of a vector scalars youre gonna confuse some people

olive willow Jun 1, 2019, 2:27 PM

#

oohh sry for that

earnest prawn Jun 1, 2019, 2:28 PM

#

especially when you bring up a linear combination before where scalars are very important

olive willow Jun 1, 2019, 2:28 PM

#

sure

#

could you give me an example of this: "a linear combination is a sum of vectors which are each multiplied with a scalar"

#

like the sum, are you just supposed to add them together

#

like [4,7] + [2,5] = [6,12]

#

if they have the same scalar

earnest prawn Jun 1, 2019, 2:31 PM

#

if you multiply a vector by a scalar you just multiply each element of the vector with that scalar

#

and yes adding vectors works like you just showed

olive willow Jun 1, 2019, 2:31 PM

#

but do they have to have the same scalar or not the two vectors

earnest prawn Jun 1, 2019, 2:31 PM

#

of course not

#

for example my linear combination

#

5 * [1,2] + 3 * [2,10] = 
[5, 10] + [6, 30] = 
[11, 40]

olive willow Jun 1, 2019, 2:32 PM

#

so [11, 40] is the linear combination

earnest prawn Jun 1, 2019, 2:33 PM

#

the result of it

olive willow Jun 1, 2019, 2:33 PM

#

of vectors [1,2] and [2,10] after the scalars have been applied

#

yh the result

earnest prawn Jun 1, 2019, 2:34 PM

#

yes
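The worked example above, 5 * [1,2] + 3 * [2,10], can be reproduced in plain Python. This is a sketch with a made-up helper name, `linear_combination`, and it assumes all vectors have the same length.

```python
def linear_combination(scalars, vectors):
    """Return scalars[0]*vectors[0] + scalars[1]*vectors[1] + ..."""
    size = len(vectors[0])
    result = [0] * size
    for scalar, vector in zip(scalars, vectors):
        for i in range(size):
            # Scale each element, then add it into the running sum.
            result[i] += scalar * vector[i]
    return result


result = linear_combination([5, 3], [[1, 2], [2, 10]])
print(result)  # [11, 40]
```

Setting every scalar to 1 gives plain vector addition, which is why [4,7] + [2,5] = [6,12] also counts as a linear combination.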

olive willow Jun 1, 2019, 2:35 PM

#

I understand now thanks dude a lot for the help you've given me!

#

cuz it's not that easy to understand the concepts at my age, that's why I ask so many questions

earnest prawn Jun 1, 2019, 2:36 PM

#

(im only two years older than you, you can get there 👍 )

olive willow Jun 1, 2019, 2:36 PM

#

hahaha thanks!

#

two years is a lot if you're learning a lot

earnest prawn Jun 1, 2019, 2:38 PM

#

I mean that linear combination and vector stuff is taught at schools here

#

(well taught to 17-18 year olds at school but still)

olive willow Jun 1, 2019, 2:38 PM

#

yh

#

I'm 14

#

so 3 to 4 years

#

we currently have how to find out what the content is of geometric forms

#

and how to use you can say scalars to get a bigger form from the basic one

earnest prawn Jun 1, 2019, 2:40 PM

#

do you mean volume?

olive willow Jun 1, 2019, 2:41 PM

#

yh I'm not taught in english so idk the names but yh

#

cm3

#

for example

earnest prawn Jun 1, 2019, 2:42 PM

#

yes volume

vestal axle Jun 1, 2019, 2:48 PM

#

Hello, anyone here familiar with mean variance optimization problem together with the Black Litterman?

#

Aka reverse optimization

#

I need some help with matrix multiplications, I have a transposed matrix with 10 rows and 156 columns, which should be multiplied with another matrix that has 156 rows and 10 columns. Can I just multiply these two together, or should I transpose the first matrix again?
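On the shapes alone: a product A·B is defined when the number of columns of A equals the number of rows of B, so a 10x156 matrix times a 156x10 matrix works as it stands and gives a 10x10 result. A small sketch with placeholder numbers, using a hand-written `matmul` helper rather than any particular library:

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    if cols_a != rows_b:
        raise ValueError(f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}")
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


# Placeholder data with the shapes from the question.
a = [[1] * 156 for _ in range(10)]   # 10 rows, 156 columns
b = [[2] * 10 for _ in range(156)]   # 156 rows, 10 columns

product = matmul(a, b)
print(len(product), len(product[0]))  # 10 10
```

Transposing the first matrix again would make it 156x10, and its 10 columns would no longer match the 156 rows of the second matrix.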

deft harbor Jun 1, 2019, 5:16 PM

#

@earnest prawn where are you that they teach L.A. at 17?

earnest prawn Jun 1, 2019, 5:17 PM

#

Nah they teach analytical geometry at 17

#

I was talking specifically about linear combinations and vectors

misty sonnet Jun 1, 2019, 5:17 PM

#

Nix is a smrt boy

stoic beacon Jun 1, 2019, 5:30 PM

#

Probably some magnet school lol

olive willow Jun 1, 2019, 5:40 PM

#

I'm teaching myself at 14 lol

#

btw @earnest prawn thanks again for explaining it

earnest prawn Jun 1, 2019, 5:43 PM

#

@stoic beacon no basic analytical geometry is taught to everyone in germany who visits the highest form of high school and is in the 11th grade aka usually 17 years old

stoic beacon Jun 1, 2019, 5:43 PM

#

This is why the US is behind lol

earnest prawn Jun 1, 2019, 5:43 PM

#

i mean we are also taught basic calculus at that age too

#

which does not mean everyone understands it though...

#

or remembers it for longer than A levels

misty sonnet Jun 1, 2019, 5:52 PM

#

@stoic beacon I mean. Germany's unis ain't great

#

Americas are

#

It's not really fair to only compare on part of a education system

#

You need to compare the whole thing

#

And to that end: They are all crap

earnest prawn Jun 1, 2019, 6:12 PM

#

why are our unis not great 😦

#

@misty sonnet

misty sonnet Jun 1, 2019, 6:14 PM

#

Well, you have Switzerland

earnest prawn Jun 1, 2019, 6:14 PM

#

what

olive willow Jun 1, 2019, 6:16 PM

#

??

misty sonnet Jun 1, 2019, 6:17 PM

#

Well, it's not that they are bad

#

They are good

#

I just don't think you guys have a top 20 uni?

#

If I am wrong: Fair play, I apologize

earnest prawn Jun 1, 2019, 6:17 PM

#

no i mean

#

how are switzerland unis related to ours

misty sonnet Jun 1, 2019, 6:18 PM

#

Albert Einstein

#

:^)

olive willow Jun 1, 2019, 6:18 PM

#

hahahaha

daring spindle Jun 1, 2019, 7:16 PM

#

Do you guys like the tensorflow ML NN tutorial?

#

And should I start with the google crash course before that?

stoic beacon Jun 1, 2019, 8:51 PM

#

There's a Google crash course?

daring spindle Jun 1, 2019, 9:16 PM

#

Yes

lean ledge Jun 1, 2019, 9:37 PM

#

https://www.reddit.com/r/MachineLearning/comments/bvibcj/discussion_the_6_types_of_data_scientist/

r/MachineLearning - [Discussion] The 6 types of data scientist:

92 votes and 41 comments so far on Reddit

earnest prawn Jun 1, 2019, 9:46 PM

#

@lean ledge and you are what type?

lean ledge Jun 1, 2019, 9:47 PM

#

Too inexperienced so far to count as one :P closest based on description is probably 2

daring spindle Jun 1, 2019, 10:41 PM

#

I am that guy who is still trying to find a goddamn mediocre course.

#

smh

fleet crag Jun 1, 2019, 10:41 PM

#

Hey guys, I'm a bit inexperienced regarding machine learning - but is it possible to find the best combination of a,b & c in the equation y=ax^2 + bx + c with a given data set via machine learning? As of right now I'm only familiar with BNN and how to use it for image recognition, and I can't for the life of me redefine my problem such that a BNN could solve it. Unless there is another type of ML that can do this?

daring spindle Jun 1, 2019, 10:45 PM

#

Yo should I try the google crash course

#

and after that do some tensorflow basics

#

or pytorch

#

depends

fleet crag Jun 1, 2019, 10:48 PM

#

I usually go to this one

#

https://pythonprogramming.net/

Python Programming Tutorials

Python Programming tutorials from beginner to advanced on a massive variety of topics. All video and text tutorials are free.

#

@daring spindle

stoic beacon Jun 1, 2019, 10:49 PM

#

@daring spindle what's the link for that course?

#

I hear good things about Keras btw

daring spindle Jun 1, 2019, 10:51 PM

#

https://developers.google.com/machine-learning/crash-course/ml-intro

Google Developers

Introduction to Machine Learning | Machine Learning Crash Cour...

#

@stoic beacon

stoic beacon Jun 1, 2019, 10:51 PM

#

Oh nice thanks

daring spindle Jun 1, 2019, 10:52 PM

#

Here, I got sent here by the TensorFlow guide

#

so I think after this

#

you should do tensorflow

#

and then you're probs set with the basics

stoic beacon Jun 1, 2019, 10:52 PM

#

TensorFlow is a little advanced for me. Too much control and too many knobs to turn

daring spindle Jun 1, 2019, 10:53 PM

#

Yes but the google course

#

will teach you about TensorFlow

stoic beacon Jun 1, 2019, 10:53 PM

#

Oh that's nice

#

Thanks I'll look into it.

daring spindle Jun 1, 2019, 10:53 PM

#

https://developers.google.com/machine-learning/crash-course/first-steps-with-tensorflow/video-lecture

Google Developers

First Steps with TensorFlow | Machine Learning Crash Course ...

#

Here its like

#

the 4th 5th lecture

stoic beacon Jun 1, 2019, 10:54 PM

#

Awesome!

#

I'll look into these two

#

I may use Keras for NN stuff at first. It's also supported by Google and has good docs I think

#

TensorFlow seems more advanced and for fine tuning big models and stuff

lean ledge Jun 2, 2019, 12:31 AM

#

@fleet crag yes it is but if your dataset isn't massive, just use normal equation for least squares

#

No need for ML when you can get an optimal answer unless it's massive dataset

fleet crag Jun 2, 2019, 12:37 AM

#

@lean ledge Although very true, I'd love to know how. Are there any papers/articles regarding the matter?

lean ledge Jun 2, 2019, 12:38 AM

#

Papers for ML or for normal equation? Both are pretty basic content so I won't be able to find any papers for it but I can link resources

#

📎 C6.pdf

#

📎 C5.pdf

#

Notes from my second year math class

fleet crag Jun 2, 2019, 12:45 AM

#

ah thanks, I actually meant for ML. But these are nice to refresh my maths again 😃 Regarding the ML part, I'll try to research myself and see if I get my answers 😄

lean ledge Jun 2, 2019, 12:47 AM

#

It shouldnt be very hard to do this using ML either. It's sort of just linear regression with an augmented dataset. When you have the dataset 10 = ax^2 + bx + c at x = 3, you just need to fit 9a+3b+c=10. Make your loss function, do linear regression on it with the 3 variables

#

[[1 x1 x1^2],
[1 x2 x2^2],
....]

#

multiplied by [[c], [b], [a]] (same order as the columns 1, x, x^2)

fleet crag Jun 2, 2019, 12:49 AM

#

yea. but arent we coming back to least squares again?

lean ledge Jun 2, 2019, 12:49 AM

#

= [[y1],[y2],...]

#

yes

#

least squares is your loss function

#

well

fleet crag Jun 2, 2019, 12:50 AM

#

well, I meant the method xd but I get it haha

lean ledge Jun 2, 2019, 12:50 AM

#

"least squares problems" means "solving for coefficients for a problem that minimises L2 distance"

#

HOW you solve for coefficients depends

mossy dragon Jun 2, 2019, 12:50 AM

#

couldn't you just use forward/backwards selection to figure out best variables?

lean ledge Jun 2, 2019, 12:51 AM

#

the method i gave solves for it analytically because least squares is a simple problem in linear algebra

#

with a simple analytical solution

#

the only problem being that the analytical solution doesnt scale well to massive datasets
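A minimal sketch of the analytical route described above, for y = ax^2 + bx + c: build one row per data point, form the normal equation (X^T X) w = X^T y, and solve the resulting 3x3 system. Everything here is illustrative: `fit_quadratic` is a made-up name, the data points are invented, and the columns are ordered x^2, x, 1 so the solution comes out as a, b, c. In practice a library least-squares solver would normally be used instead of hand-written elimination.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equation."""
    rows = [[x * x, x, 1.0] for x in xs]  # one row [x^2, x, 1] per data point
    # Augmented 3x4 matrix for (X^T X) w = X^T y.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]  # a, b, c


xs = [-2, -1, 0, 1, 2, 3]
ys = [2 * x * x - 3 * x + 1 for x in xs]  # invented points on y = 2x^2 - 3x + 1
a, b, c = fit_quadratic(xs, ys)
print(round(a, 6), round(b, 6), round(c, 6))
```

With noisy data the same function returns the coefficients that minimise the sum of squared errors, which is the least squares loss mentioned above.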

mossy dragon Jun 2, 2019, 12:52 AM

#

nvm i thought you guys were talking about variable selection in linear regression

lean ledge Jun 2, 2019, 12:52 AM

#

that's when you use ML, because ML usually doesn't give the optimal answer, but it gets to an approximate answer faster

#

Anyways, there's a lot of resources to learn ML around. Columbia's course is my recommendation. If you dont know the basics, I'd start here @fleet crag https://www.edx.org/course/machine-learning-columbiax-csmm-102x-0?utm_medium=affiliate_partner&utm_source=CredEdLLC

edX

Machine Learning

Master the essentials of machine learning and algorithms to help improve learning from data without human intervention.

#

Given you said the maths I linked is a refresher, it should be fine for you

fleet crag Jun 2, 2019, 12:58 AM

#

sweet 😃

#

thanks @lean ledge

lean ledge Jun 2, 2019, 12:58 AM

#

nw

mossy dragon Jun 2, 2019, 1:01 AM

#

The other day this Econ PHD candidate was telling me how there was so much bad statistics going on in the data science field currently and how that was going to change.

lean ledge Jun 2, 2019, 1:02 AM

#

He's not wrong about bad statistics going on. Dunno about it changing any time soon

mossy dragon Jun 2, 2019, 1:06 AM

#

Yea I'm wondering about that because when I look at job posts online they sometimes ask for just a CS degree. I don't think CS majors would have the stats necessary to really excel at the field, right? And I'm not sure it's the easiest thing to learn while on the job either.

#

I wonder which person has a higher chance of getting an interview, a person with a stats degree or a person with a CS degree.

silent swan Jun 2, 2019, 1:13 AM

#

in tech A/B testing is considered its own subfield

lean ledge Jun 2, 2019, 1:17 AM

#

@mossy dragon Dunno about interview/hiring process but out of all the quantitative STEM degrees, CS people have some of the worst maths background because at a lot of places the only requirements for maths are maybe calc 2, intro stat and discrete maths, and there's a lot less maths in CS subjects than other degrees

#

Since CS degrees arent really CS degrees and more software degrees, poor maths backgrounds are common

sand reef Jun 2, 2019, 8:16 AM

#

A question regarding HopField neural network. I have been making a small project on it, with pattern recognition of a grid of size 7x7. I am not sure why, but for some reason, the network keeps on only converging to only the latest learnt pattern. I tried the formula for updating and all on a smaller vector only implementation, and it worked perfectly there.

#

I would like to ask someone to see if my implementation of the formulae is correct? Or do I have it messed up?

sand reef Jun 2, 2019, 8:52 AM

#

```python
    def output(self, panel):
        for i in range(7):
            for j in range(7):
                panel.secondMatrix[i][j].setVal(self.matrix[i][j])

    def update(self):
        for i in range(49):
            self.matrix[i // 7][i % 7] = self.vector[i]

    def runAsync(self, panel, number):
        for i in range(number):
            r = random.randint(0, 48)
            temp = 0
            for j in range(49):
                if r != j:
                    temp += panel.learn.unwinded_matrix[r][j] * self.vector[j]
            self.vector[r] = self.sign(temp)
        self.update()
        self.output(panel)
```

#

```python
class LearnButton(wx.Button):
    def __init__(self, panel, pos):
        super().__init__(panel, label="Learn", pos=pos)
        self.matrix = []
        self.panel = panel
        for i in range(7):
            temp = []
            for j in range(7):
                temp.append(0)
            self.matrix.append(temp)
        self.Bind(wx.EVT_BUTTON, self.onClick)
        self.energy = 0
        self.unwinded_matrix = [[0 for x in range(49)] for y in range(49)]
        self.unwinded_vector = []
        for i in range(7):
            for j in range(7):
                self.unwinded_vector.append(self.matrix[i][j])

    def calcEnergy(self):
        pass

    def onClick(self, event):
        for i in range(7):
            for j in range(7):
                self.matrix[i][j] += self.panel.matrix[i][j].getVal()
        for i in range(7):
            for j in range(7):
                self.unwinded_vector[i*7+j] = self.matrix[i][j]
        for i in range(49):
            for j in range(49):
                if i != j:
                    self.unwinded_matrix[i][j] += (2*self.unwinded_vector[i]-1)*(2*self.unwinded_vector[j]-1)
```

#

For some reason, any pattern learnt previously is never converged to when I am running the network. It only converges to the latest learnt pattern. (Did some editing, it now converges to a combination of all the learnt states, the above code was unchanged.) Help? It only converges to the combined state.
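One possible reading of the code above, not verified against the full project: `onClick` does `self.matrix[i][j] += ...getVal()`, so `self.matrix` becomes a running sum of every pattern shown so far, and the weight update is then computed from that sum instead of from the newly shown pattern alone. That could explain convergence to a combined state. The sketch below shows the usual Hebbian storage rule, where each pattern contributes its own outer product. All names and the tiny 8-cell patterns are made up, and it assumes the recall state uses -1/+1 values.

```python
import random


def to_bipolar(pattern):
    """Map a 0/1 pattern to -1/+1, as the (2*v - 1) terms above do."""
    return [2 * bit - 1 for bit in pattern]


def train(patterns):
    """Hebbian rule: add each pattern's own outer product to the weights."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        s = to_bipolar(pattern)  # only the current pattern, no running sum
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += s[i] * s[j]
    return weights


def recall(weights, state, sweeps=5, seed=0):
    """Asynchronous recall: update one unit at a time in random order."""
    rng = random.Random(seed)
    state = list(state)
    order = list(range(len(state)))
    for _ in range(sweeps):
        rng.shuffle(order)
        for r in order:
            total = sum(weights[r][j] * state[j] for j in order if j != r)
            state[r] = 1 if total >= 0 else -1
    return state


# Two made-up 8-cell patterns (a 7x7 grid would have 49 cells).
first = [1, 1, 1, 1, 0, 0, 0, 0]
second = [1, 0, 1, 0, 1, 0, 1, 0]
weights = train([first, second])

noisy = to_bipolar(first)
noisy[0] = -noisy[0]  # corrupt one cell of the first pattern
print(recall(weights, noisy) == to_bipolar(first))
```

In terms of the posted class, the equivalent change would be to build the 49-element vector from the panel's current values only, and keep accumulating just the weight matrix.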

olive willow Jun 2, 2019, 1:11 PM

#

what is linear algebra used for in data science, could someone give me an example in code
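One small illustration, with invented numbers: a dataset is commonly stored as a matrix with one row per sample, a simple linear model is a vector of weights, and making predictions for every sample at once is a matrix-vector product. The helper name `mat_vec` and the house data are made up for this sketch.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Made-up data: each row is one house as [size in m^2, number of rooms].
houses = [
    [50, 2],
    [80, 3],
    [120, 5],
]
# Made-up model: price = 2 * size + 10 * rooms (in thousands).
weights = [2, 10]

prices = mat_vec(houses, weights)
print(prices)  # [120, 190, 290]
```

Each predicted price is a linear combination of that house's features, with the weights as the scalars.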

daring spindle Jun 2, 2019, 1:19 PM

#

The things I have seen in Andrew Ng's course were like theory

#

Like this

#

📎 image0.png

#

But I am fairly basic

olive willow Jun 2, 2019, 1:19 PM

#

??

#

what's that

daring spindle Jun 2, 2019, 1:19 PM

#

Its theory

#

Like

#

Wait

#

Yeah

olive willow Jun 2, 2019, 1:19 PM

#

I just know vectors, linear combination, span and that's about it

#

matrix and tensor

#

also

daring spindle Jun 2, 2019, 1:20 PM

#

I am literally 13 all the algebra I have seen came from ML courses

olive willow Jun 2, 2019, 1:21 PM

#

I'm 14 but I know algebra but you shouldn't do ML when you're 13

#

you need linear algebra, calc and stats to understand it

daring spindle Jun 2, 2019, 1:21 PM

#

Why wouldn't I? My grades allow me to spend time on it and I love it

#

I’ll pick that up on the way

olive willow Jun 2, 2019, 1:21 PM

#

no dude

#

you won't

#

that's almost college level

#

you only will know what it's about

#

not how to do it the right way

daring spindle Jun 2, 2019, 1:22 PM

#

I like it. My math teacher helps me with the hard parts

olive willow Jun 2, 2019, 1:23 PM

#

But it doesn't matter if you like it, it matters if you understand it. And if you're 13 you haven't even had linear algebra at school. I don't think even stats

daring spindle Jun 2, 2019, 1:23 PM

#

Anyone can learn anything, some faster than others. But everyone can learn

sand reef Jun 2, 2019, 1:23 PM

#

linear algebra is just matrix and vector manipulation

#

its simple and anyone can learn it

olive willow Jun 2, 2019, 1:23 PM

#

yh but we're talking about ML

#

the whole thing

#

calc

sand reef Jun 2, 2019, 1:24 PM

#

yeah, i know, i have also done ml and deep learning

olive willow Jun 2, 2019, 1:24 PM

#

but he's 13

#

that's the thing

sand reef Jun 2, 2019, 1:24 PM

#

no worries, he can do it, if he likes it

daring spindle Jun 2, 2019, 1:24 PM

#

10 - 20 hours a week

#

Thats my goal

olive willow Jun 2, 2019, 1:25 PM

#

yh but look, do you even know what the symbol sigma is

daring spindle Jun 2, 2019, 1:25 PM

#

Yes

sand reef Jun 2, 2019, 1:25 PM

#

yeah, just don't overwork yourself

daring spindle Jun 2, 2019, 1:25 PM

#

I learned it in Andrew Ng’s course

olive willow Jun 2, 2019, 1:25 PM

#

just start at the basics

#

learn slowly, not that you can't do it but chill you have time

sand reef Jun 2, 2019, 1:25 PM

#

say, anyone of you guys knows about hopfield neural networks?

#

i srsly need help with that

olive willow Jun 2, 2019, 1:25 PM

#

not really

#

ask nix if he's on maybe he knows

sand reef Jun 2, 2019, 1:26 PM

#

he is off

olive willow Jun 2, 2019, 1:26 PM

#

then idk

#

but yh calc at 13 hhmmmmm

sand reef Jun 2, 2019, 1:27 PM

#

i m literally on the end of my small project, and freakin, idk what i am calculating wrong T^T

olive willow Jun 2, 2019, 1:27 PM

#

ask at stack

#

maybe it will help

#

or a yt vid

sand reef Jun 2, 2019, 1:27 PM

#

i guess i could try stack, but yt, nah

olive willow Jun 2, 2019, 1:28 PM

#

yh

#

or ask in the help section

sand reef Jun 2, 2019, 1:28 PM

#

i asked, they pointed me here

olive willow Jun 2, 2019, 1:28 PM

#

oohh lol, yh sure

#

Idk dude

#

I'm learning linear algebra for data science. like data matrix and vectors

sand reef Jun 2, 2019, 1:29 PM

#

mhm....i'll just wait for someone to help

olive willow Jun 2, 2019, 1:29 PM

#

and guess from which country my yt tutorial guy is

#

india

sand reef Jun 2, 2019, 1:30 PM

#

nice

olive willow Jun 2, 2019, 1:30 PM

#

yup

daring spindle Jun 2, 2019, 1:30 PM

#

If he links you to tech support

#

Cut it

sand reef Jun 2, 2019, 1:31 PM

#

xD, but nah

#

@earnest prawn , I have been told that you could help. Could you help me with HopField Neural Network Implementation in python? I have a major issue in making of a small project.

earnest prawn Jun 2, 2019, 1:54 PM

#

Sorry but that's something I've not heard of until today, can't help you with that

sand reef Jun 2, 2019, 2:06 PM

#

Oh okay. Thanks anyways.

granite basin Jun 2, 2019, 5:59 PM

#

how would one use clustering for image data? I'm not getting very good results, even after applying PCA

stoic beacon Jun 2, 2019, 6:51 PM

#

Generally, if a date/time is involved in your data, is it going to be better suited to time series analysis?

#

I'm trying to find a fun dataset to work with but keep picking ones with dates or over a period of time

olive willow Jun 2, 2019, 7:06 PM

#

do you want a pokemon dataset?

#

or fifa 19 one

#

or world bank stats?

#

@stoic beacon

stoic beacon Jun 2, 2019, 7:06 PM

#

not sure haha

olive willow Jun 2, 2019, 7:07 PM

#

📎 country.csv

#

📎 pokemon.csv

#

here you have two

stoic beacon Jun 2, 2019, 7:08 PM

#

interesting

olive willow Jun 2, 2019, 7:12 PM

#

yup

#

I just asked myself some questions I would like to find the answers to

#

and then used that data to find them, I suggest you do the same thing

stoic beacon Jun 2, 2019, 7:18 PM

#

yeah

#

we'll see lol

#

I have a lot on my plate

olive willow Jun 2, 2019, 7:19 PM

#

sure np

stoic beacon Jun 2, 2019, 7:20 PM

#

where did you find those btw?

olive willow Jun 2, 2019, 7:20 PM

#

www.kaggle.com

#

they have every kind of dataset you can imagine for free

stoic beacon Jun 2, 2019, 7:21 PM

#

ah okay yeah ive been there

onyx granite Jun 2, 2019, 8:10 PM

#

kaggle is awesome

olive willow Jun 2, 2019, 8:28 PM

#

yh

teal night Jun 2, 2019, 9:49 PM

#

@stoic beacon Well, what would your preferences be for a dataset to work on?

stoic beacon Jun 2, 2019, 10:03 PM

#

@teal night what do you mean?

teal night Jun 2, 2019, 10:54 PM

#

Quantity?

#

complexity of the dataset

sand reef Jun 3, 2019, 4:46 AM

#

If your data is sequential or the date / time is a major factor in it, yes then you can use time series analysis.

#

@granite basin you mean using clustering for classification? If yes, how are you taking your features? Are you extracting the features using algorithms?

lapis sequoia Jun 3, 2019, 8:48 AM

#

is anyone alive

#

I'm having trouble assigning new columns

#

def split_semantic_path(df_rows):
  semantic_paths = re.findall(ARGUMENT_PATTERN, df_rows, re.DOTALL)
  return semantic_paths

data_df[['semantic_path0', 'semantic_path1', 'semantic_path2']] = data_df['semantics'].apply(split_semantic_path)

#

but it tells me those columns don't exist.. they don't.. this call is supposed to create them

lapis sequoia Jun 3, 2019, 9:18 AM

#

nvm I got it

#

had to pass the return as pd.Series
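A sketch of what that fix can look like. The regex and the sample rows are hypothetical, since the original `ARGUMENT_PATTERN` and data were not shown. Returning a `pd.Series` from the function makes `apply` produce a DataFrame with one column per match, which can then be assigned to several new column names at once.

```python
import re
import pandas as pd

# Hypothetical pattern; the real ARGUMENT_PATTERN was not posted
ARGUMENT_PATTERN = r"\w+"

def split_semantic_path(cell):
    """Return the first three regex matches as a Series (one value per new column)."""
    parts = re.findall(ARGUMENT_PATTERN, cell, re.DOTALL)
    return pd.Series(parts[:3])

# Hypothetical sample data
data_df = pd.DataFrame({"semantics": ["a/b/c", "x/y/z"]})

new_cols = ["semantic_path0", "semantic_path1", "semantic_path2"]
# apply() now returns a DataFrame, so the three columns are created in one go
data_df[new_cols] = data_df["semantics"].apply(split_semantic_path)
```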

fleet crag Jun 3, 2019, 9:26 AM

#

Is it possible to train a neural network to solve an optimization problem? (even though the training of the NN is an optimization problem itself)

lapis sequoia Jun 3, 2019, 9:27 AM

#

hmmmm

#

no

granite basin Jun 3, 2019, 11:29 AM

#

@sand reef I used PCA to select the features, it does cluster the data, which is unlabeled, but with just some manual checking I can already tell that it's not very accurate

lost sinew Jun 3, 2019, 12:26 PM

#

how do i do a cross correlation to find the time lag/lead of my data

#

i have charted out the correlation between all items in a pearson correlation chart

#

is there a way to find a cross-correlation of each ?

lapis sequoia Jun 3, 2019, 12:35 PM

#

you mean correlation of each item vs each other.. that's exactly what you need to do

lost sinew Jun 3, 2019, 1:50 PM

#

avg_btc_price_usd price_usd
1 -0.057079 -0.384172
2 -0.088811 -0.110334
3 -0.047064 0.301020
4 0.003190 0.291260
5 -0.006247 0.419880
6 0.012485 0.266879
7 0.099603 -0.155015
8 0.059023 -0.206790
9 -0.001597 -0.010660
10 0.001780 -0.126942

#

how would i cross correlate this dataframe?

desert oar Jun 3, 2019, 2:10 PM

#

you just want the correlation between the two series?

#

maybe http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.corr.html#pandas.DataFrame.corr
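`DataFrame.corr` gives the correlation at zero lag only. One possible way to look for a lead or lag, sketched here on synthetic data (the helper name `lagged_corr` and the series are made up), is to shift one series by each candidate lag and compute the Pearson correlation each time.

```python
import numpy as np
import pandas as pd

def lagged_corr(a, b, max_lag):
    """Pearson correlation of a with b shifted by each lag in [-max_lag, max_lag]."""
    return pd.Series(
        {lag: a.corr(b.shift(lag)) for lag in range(-max_lag, max_lag + 1)}
    )

# Synthetic example: y follows x with a delay of 3 steps plus a little noise
rng = np.random.default_rng(0)
x = pd.Series(rng.normal(size=200))
y = x.shift(3) + 0.1 * rng.normal(size=200)

corrs = lagged_corr(y, x, max_lag=5)
# Lag with the strongest absolute correlation
best_lag = corrs.abs().idxmax()
```

With real columns such as `avg_btc_price_usd` and `price_usd`, the same helper could be called on the two columns directly. Which sign of the lag counts as "leading" depends on which series is shifted.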

sand reef Jun 3, 2019, 2:12 PM

#

@granite basin generally, clustering isn't very accurate in terms of image classification because of feature selection. If I am not wrong.

#

So it could very possibly be the features selected that might be causing an issue.

#

@fleet crag not sure, but if you can represent your optimization problem in such a way, I think there are some models that do converge to a global minimum. Something like hopfield networks and all.

fleet crag Jun 3, 2019, 2:18 PM

#

Funnily enough, I'm reading "Neural Computation of Decisions in Optimization Problems" by Hopfield himself (1985) at this moment @sand reef

#

Where he uses a NN to solve the travelling salesman problem

granite basin Jun 3, 2019, 2:36 PM

#

@sand reef Hmm yea I think you're right, I have an unlabeled dataset of written numbers, and a dataset of spoken numbers. The only 'labels' I have are if they describe the same thing or not. I wanted to label the images with clustering and feed this to the spoken data but accuracy is not very good

lean ledge Jun 3, 2019, 3:02 PM

#

@fleet crag Yes, many ways for neural nets to solve optimization problems

#

A lot of the stuff that neural networks are used for nowadays used to be treated as DP problems or something equally Bellman-y

#

Neural networks are just function approximators. How you use them is up to you

#

eg. Reinforcement learning is an attempt at approximate optimal control theory. Deep RL is using NNs to (ideally) solve optimal control problems

sand reef Jun 3, 2019, 3:35 PM

#

I see. So I am happy, that I was right. Yey!

naive shore Jun 3, 2019, 3:38 PM

#

hi! i'm trying to extract frequencies from short samples of data with Numpy's FFT, but seems it can't catch the wavelength marked with red or green dots. Is it mathematically impossible with FFT to do this on such a short segment?

📎 unknown.png
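One relevant limit is frequency resolution: the FFT bins are spaced `sample_rate / n_samples` apart, so a short segment gives coarse bins. A small sketch, where the 44100 Hz sample rate and the segment length are assumptions:

```python
import numpy as np

sample_rate = 44100   # Hz (assumed)
n_samples = 2048      # hypothetical segment length

# Frequencies of the bins for a real-valued signal of this length
freqs = np.fft.rfftfreq(n_samples, d=1 / sample_rate)

# Distance between neighbouring bins, equal to sample_rate / n_samples
bin_spacing = freqs[1] - freqs[0]
print(f"bin spacing: {bin_spacing:.2f} Hz")
```

Any frequency that falls between two bins is smeared across its neighbours, so peaks from a short segment are only accurate to roughly one bin width.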

#

i hope its data science sorry if im wrong grumpchib

sand reef Jun 3, 2019, 3:51 PM

#

Fast Fourier Transform? I know what it is and what it does, but I don't know what its limitations are. I can look it up though.

naive shore Jun 3, 2019, 3:53 PM

#

well if you can and if you want )
it actually registers what i need (marked with dots) but its like not the main result, just among the garbage

stuck prawn Jun 3, 2019, 3:54 PM

#

Guys, anyone heard about DataQuest? Is the paid plan worth it for getting the technical skills for a job as a fresher in the Data Science field?

sand reef Jun 3, 2019, 3:57 PM

#

@naive shore do you have the code for it? How have you generated the function for it?

naive shore Jun 3, 2019, 3:57 PM

#

i'll give you my python with fft and a text file with the waveform ok?

#

==========================================================
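import numpy as np

fname = "note2.txt"
with open(fname) as f:
    # one sample per line, parsed as a float
    x = [float(line) for line in f if line.strip()]
w = np.fft.fft(x)
freqs = np.fft.fftfreq(len(w))  # in cycles per sample
#np.fft.fftshift(freqs)

idx = np.argmax(np.abs(w))
idx2 = np.where(np.abs(w)>80000)  # bins whose magnitude is above the threshold
#print(w)
freq = freqs[idx2]
freq_in_hertz = abs(freq * 44100)  # 44100 Hz sample rate
print(freq_in_hertz)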

📎 note2.txt

sand reef Jun 3, 2019, 3:58 PM

#

Sure. I'll try my best to see what went wrong. Although I am thinking this might be beyond me.

naive shore Jun 3, 2019, 3:59 PM

#

its "note" like musical note pitch )

sand reef Jun 3, 2019, 4:00 PM

#

I see. And for some reason the output is skipping two frequencies?

naive shore Jun 3, 2019, 4:01 PM

#

not skipping but they are not returned as main one. well actually its the same freq with green and red dots

#

242.97

#

this one

sand reef Jun 3, 2019, 4:02 PM

#

I see. So just one frequency is not being returned by the fft function?

naive shore Jun 3, 2019, 4:03 PM

#

it is returned but its like not the main one
and it is obvious looking at waveform

#

idx2 = np.where(np.abs(w)>80000)
this part filters results

#

when 'the bar' is 80000 the needed frequency is shown among others.
but if i raise the bar so there would be one single frequency - its not there

#

so i cant filter for it specifically

#

i know the exact value for this segment, but i need the program to see it as the one i need too

sand reef Jun 3, 2019, 4:08 PM

#

So, when you set value for np.abs(w) > some high value, it's not filtering out

#

Where it's supposed to be the highest frequency in the entire waveform?

#

Is it the highest frequency in the entire waveform?

naive shore Jun 3, 2019, 4:09 PM

#

not the highest, but its still main

#

it leaves the higher octave of freq i need

#

is it how it should work?

sand reef Jun 3, 2019, 4:11 PM

#

The np.where(np.abs(w) > 80000) would mean all the frequencies of the waveform that have a value greater than 80000

#

So, only the frequencies greater than 80000 should be returned right?

naive shore Jun 3, 2019, 4:13 PM

#

no. im not fully aware how it works, but 80000 is not a frequency limit, but kind of a number of times fft finds a particular frequency
i guess....

sand reef Jun 3, 2019, 4:14 PM

#

So now I am confused. Doesn't FFT return all the frequencies used to make a certain waveform?

naive shore Jun 3, 2019, 4:15 PM

#

kind of. but its several steps
the fft function returns "SOMETHING" that can then be converted to frequencies with the fftfreq function

sand reef Jun 3, 2019, 4:15 PM

#

And if that is the case, then np.where(np.abs(w) > 80000) should mean the positions in the array where the values are greater than 80000, right?

#

I see. Let me see what it returns.

#

So, fft returns some complex number values

naive shore Jun 3, 2019, 4:19 PM

#

"This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT]."
that's what the docs say
gosh im bad at math 😃

#

yeah, when i read about fft i came across something about complex numbers

sand reef Jun 3, 2019, 4:21 PM

#

And the frequencies are returned by the fftfreq function

naive shore Jun 3, 2019, 4:21 PM

#

yes

desert oar Jun 3, 2019, 4:22 PM

#

fft freq is the x axis of the frequency plot

sand reef Jun 3, 2019, 4:22 PM

#

So, abs(w) = sqrt (a^2 + b^2)?
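A quick check of that on two made-up complex values: `np.abs` returns the magnitude `sqrt(a^2 + b^2)` of each complex number, and `np.angle` returns its phase.

```python
import numpy as np

# Two hypothetical FFT outputs of the form a + bj
w = np.array([3 + 4j, 1 - 1j])

magnitude = np.abs(w)                         # length of each complex number
manual = np.sqrt(w.real ** 2 + w.imag ** 2)   # same thing written out by hand
phase = np.angle(w)                           # angle in radians

print(magnitude, manual, phase)
```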

desert oar Jun 3, 2019, 4:22 PM

#

in fact theres an example in the docs...https://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.fft.html

naive shore Jun 3, 2019, 4:24 PM

#

well yeah i took the code from examples. it shows correct frequencies, the thing is numpy doesnt see main frequency as main one

sand reef Jun 3, 2019, 4:24 PM

#

So that would mean that the length of the rotating vector formed by your X and Y axis values has to be greater than 80000?

naive shore Jun 3, 2019, 4:25 PM

#

can we agree that the wavelength between dots is like the main one? or is it just my imagination 😃

📎 unknown.png

desert oar Jun 3, 2019, 4:25 PM

#

i dont see any values > 80000

#

oh wait sorry

#

thats in the FFT

#

not in the data

sand reef Jun 3, 2019, 4:25 PM

#

Yus

desert oar Jun 3, 2019, 4:26 PM

#

i was confused hah

naive shore Jun 3, 2019, 4:26 PM

#

its a signal from guitar, so that frequency is the only right one

sand reef Jun 3, 2019, 4:26 PM

#

https://stackoverflow.com/questions/47336723/what-exactly-does-np-fft-fft-return

Stack Overflow

What exactly does np.fft.fft return?

I'm confused about understanding FFT's and how to apply them in python. From my understanding applying an fft to a 10-pixel 1D array should contain a list of 10 numbers (+2 for the "DC" component):...

naive shore Jun 3, 2019, 4:26 PM

#

i know cause i played it )

sand reef Jun 3, 2019, 4:26 PM

#

This. Tells what fft returns

desert oar Jun 3, 2019, 4:26 PM

#

@naive shore do you know what that frequency is approximately?

naive shore Jun 3, 2019, 4:26 PM

#

242.97520661 this

#

it shows up but among others

#

if i raise the filter - it leaves the octave of that, so like twice of that, 485.9504132

desert oar Jun 3, 2019, 4:28 PM

#

whats the unit here?

naive shore Jun 3, 2019, 4:28 PM

#

hz

#

freq in Hz
and the sample spacing is 1/44100 s

desert oar Jun 3, 2019, 4:29 PM

#

how are you getting that from the fftfreq output

#

ahhh ok

#

yeah signal processing is probably the one area where i'm truly newbie

#

In [50]: fft_freqs[data_fft > 80000] * 44100
Out[50]: array([484.61538462, 969.23076923])

naive shore Jun 3, 2019, 4:31 PM

#

yeah, almost there, just need half of that first number )

sand reef Jun 3, 2019, 4:31 PM

#

So. abs(w) > 80000 means all values whose amplitude is greater than 80000

#

I see now.

desert oar Jun 3, 2019, 4:32 PM

#

import matplotlib.pyplot as plt
import numpy as np

with open('note2.txt') as f:
    data = np.array([float(line.strip()) for line in f])

data_fft = np.fft.rfft(data)
fft_freqs = np.fft.fftfreq(data_fft.size)

plt.plot(fft_freqs * 44100, data_fft, '.-')
plt.xlim((-10000, 10000))
plt.grid()
plt.show()
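One thing worth checking in a snippet like this: `np.fft.rfft` returns only `n // 2 + 1` bins for `n` input samples, so its frequency axis comes from `np.fft.rfftfreq(n, d)` with the original sample count. Passing the shorter spectrum length to `np.fft.fftfreq` makes every bin look about twice as high in frequency. A sketch on a synthetic tone (the 440 Hz tone and the segment length are assumptions, not the note2.txt data):

```python
import numpy as np

sample_rate = 44100
n = 4410                                   # 0.1 s of samples (hypothetical)
t = np.arange(n) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)       # synthetic 440 Hz tone

spectrum = np.fft.rfft(signal)             # n // 2 + 1 complex bins
magnitude = np.abs(spectrum)

# Matching frequency axis: built from the ORIGINAL sample count
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)
peak_hz = freqs[np.argmax(magnitude)]

# Mismatched axis: fftfreq sized from the rfft output
wrong_freqs = np.fft.fftfreq(spectrum.size) * sample_rate
wrong_peak_hz = wrong_freqs[np.argmax(magnitude)]

print(peak_hz, wrong_peak_hz)
```

The first value lands on the tone's frequency, while the second comes out close to double it.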

#

yeah @sand reef thats just numpy

#

if nobody can step in and figure this out, might be a good one for math.stackexchange.com

#

or some equivalent site. maybe stackoverflow

olive willow Jun 3, 2019, 4:34 PM

#

what's wrong?

desert oar Jun 3, 2019, 4:34 PM

#

signal processing. FFT isnt picking up on what ought to be the dominant frequency

sand reef Jun 3, 2019, 4:34 PM

#

I need to know. Since things are making some sense to me now. Which statement is causing the issue? As in which print statement is giving the erroneous part?

desert oar Jun 3, 2019, 4:34 PM

#

trying to figure out why. this is not my area of expertise

olive willow Jun 3, 2019, 4:35 PM

#

can you send the txt file?

desert oar Jun 3, 2019, 4:35 PM

#

@sand reef the maximum amplitude of data_fft in my code ought to be at around 243

sand reef Jun 3, 2019, 4:35 PM

#

I see.

desert oar Jun 3, 2019, 4:35 PM

#

https://cdn.discordapp.com/attachments/366673247892275221/585135201923891210/note2.txt

#

hmm @naive shore is there some reason it would double the frequency?

#

im wondering if maybe we in all our FFT noobness are missing something simple in how its supposed to be used

sand reef Jun 3, 2019, 4:35 PM

#

Is the file being read right?

desert oar Jun 3, 2019, 4:36 PM

#

yes thats what my code does

naive shore Jun 3, 2019, 4:36 PM

#

well the guitar waveform by itself consists of the main freq and harmonics (octaves of that). so thats the difficulty
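To illustrate that point, here is a sketch with a synthetic signal (the amplitudes and the 243 Hz fundamental are assumptions): when the second harmonic is stronger than the fundamental, the largest FFT bin sits at twice the played pitch. Taking the lowest bin that reaches some fraction of the maximum is one simple heuristic for getting back to the fundamental. It is only an illustration, not a robust pitch detector.

```python
import numpy as np

sample_rate = 44100
n = 44100                       # 1 s of samples, so bins are 1 Hz apart
t = np.arange(n) / sample_rate
f0 = 243.0                      # hypothetical fundamental in Hz

# Fundamental at half the amplitude of its second harmonic
signal = 0.5 * np.sin(2 * np.pi * f0 * t) + 1.0 * np.sin(2 * np.pi * 2 * f0 * t)

magnitude = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)

# Strongest bin: this is the harmonic, not the fundamental
strongest_hz = freqs[np.argmax(magnitude)]

# Heuristic: lowest bin with at least 25% of the maximum magnitude
candidates = np.flatnonzero(magnitude >= 0.25 * magnitude.max())
lowest_strong_hz = freqs[candidates[0]]

print(strongest_hz, lowest_strong_hz)
```

The 25% threshold is arbitrary and would need tuning on real recordings, where noise and spectral leakage blur the peaks.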

desert oar Jun 3, 2019, 4:36 PM

#

i dont think thats it though

#

you were wise to realize that its about 2x the correct frequency

#

so i think we are just misusing FFT

sand reef Jun 3, 2019, 4:38 PM

#

FFT is returning the amplitude times cos of the phase angle plus i times the amplitude times sin of the phase angle, right?

olive willow Jun 3, 2019, 4:38 PM

#

this is the error right?

#

  return array(a, dtype, copy=False, order=order)

desert oar Jun 3, 2019, 4:38 PM

#

no

#

its not a code error

#

its a logic error

olive willow Jun 3, 2019, 4:39 PM

#

what do you want the code to do?

sand reef Jun 3, 2019, 4:39 PM

#

So, we need to max out the amplitude by np.max(np.abs(w))?

olive willow Jun 3, 2019, 4:39 PM

#

read it and then plot it?

desert oar Jun 3, 2019, 4:39 PM

#

@olive willow yes, but that's not the question

#

my code does that and it works. that's not what we are discussing

sand reef Jun 3, 2019, 4:40 PM

#

I think I am not getting the error.... I guess.

desert oar Jun 3, 2019, 4:40 PM

#

there is no error

#

the error is "why isn't this returning what i expected it to return"

naive shore Jun 3, 2019, 4:40 PM

#

yes )

sand reef Jun 3, 2019, 4:40 PM

#

Oh.

desert oar Jun 3, 2019, 4:40 PM

#

fft_freqs[np.argmax(data_fft)] should be around 243

olive willow Jun 3, 2019, 4:40 PM

#

yh there is no error

desert oar Jun 3, 2019, 4:40 PM

#

but its more like 486

#

so we are trying to figure out why its 2x what it should be

sand reef Jun 3, 2019, 4:40 PM

#

Oh.

desert oar Jun 3, 2019, 4:41 PM

#

and i suspect its because both arnold and i are relatively inexperienced with this and aren't using it correctly

olive willow Jun 3, 2019, 4:41 PM

#

this is the graph?

📎 Figure_1.png

desert oar Jun 3, 2019, 4:42 PM

#

oh yeah the argmax is actually at 962

#

which is... approximately 4x the required frequency

sand reef Jun 3, 2019, 4:43 PM

#

So, there is a different dominant frequency?

#

Np.argmax, it converts complex numbers to real numbers right?

desert oar Jun 3, 2019, 4:44 PM

#

no

sand reef Jun 3, 2019, 4:44 PM

#

By taking the square root thing.

desert oar Jun 3, 2019, 4:44 PM

#

no

sand reef Jun 3, 2019, 4:45 PM

#

So, what does it take the max of?

desert oar Jun 3, 2019, 4:45 PM

#

x = [5,2,7,4]
print(np.argmax(x))
print(x[np.argmax(x)])

#

https://docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html

sand reef Jun 3, 2019, 4:45 PM

#

Yeah. It will return the argument of the max value there right?

desert oar Jun 3, 2019, 4:46 PM

#

"argument" being a term borrowed from math

#

in this case it just means the position of the max in the array

#

as opposed to the value of the max

sand reef Jun 3, 2019, 4:46 PM

#

Yus

#

So, I am asking, the max it calculates.

desert oar Jun 3, 2019, 4:46 PM

#

data_fft is the amplitudes yes

sand reef Jun 3, 2019, 4:47 PM

#

Yus.

desert oar Jun 3, 2019, 4:47 PM

#

fft_freqs[np.argmax(data_fft)] is the frequency that corresponds to the max amplitude

sand reef Jun 3, 2019, 4:47 PM

#

Wait no. Is it? I thought even rfft returned complex numbers? Whose absolute value gave the amplitude?

desert oar Jun 3, 2019, 4:49 PM

#

oh. probably the magnitude of the complex number

#

here can do it w/ real part

fft_freqs[np.argmax(data_fft.real)]

#

same answer

sand reef Jun 3, 2019, 4:49 PM

#

Please try, this if it works

#

fft_freqs[np.argmax(np.abs(data_fft))]

#

Since I am on my phone, I can't check if it works or not.

desert oar Jun 3, 2019, 4:51 PM

#

same answer

sand reef Jun 3, 2019, 4:51 PM

#

Fk.

#

Is the answer supposed to be 243?

#

If it is, yeah, this is beyond me then.

#

Sorry for not being able to help.

desert oar Jun 3, 2019, 4:54 PM

#

yeah. again i suspect this is "user error"

#

4x off is too close to correct to be truly wrong

naive shore Jun 3, 2019, 4:57 PM

#

maybe its numpy fault

desert oar Jun 3, 2019, 4:59 PM

#

no?

#

lol

#

scipy is built on top of numpy

naive shore Jun 3, 2019, 4:59 PM

#

sorry just a bad joke )

sand reef Jun 3, 2019, 5:03 PM

#

Oof.

naive shore Jun 3, 2019, 5:03 PM

#

what?

#

too close?

#

)

sand reef Jun 3, 2019, 5:04 PM

#

Yus

#

Exactly 4x off.

naive shore Jun 3, 2019, 5:13 PM

#

doing some tests with other waveforms. its always 2x with my code. probably should just divide by 2 and be done with it )

stoic beacon Jun 3, 2019, 5:14 PM

#

Halp halp I need halp

#

Linear transformations wuuuuut

#

Watching 3blue1brown on them and I'm stuck at 4:21

#

No link cuz I'm on my phone and for some reason Google won't include a share button that has a timestamp option

sand reef Jun 3, 2019, 5:19 PM

#

Wat?

#

What happened?

stoic beacon Jun 3, 2019, 5:19 PM

#

I'm stuck on understanding linear transformations

sand reef Jun 3, 2019, 5:19 PM

#

Okay....?

stoic beacon Jun 3, 2019, 5:20 PM

#

So I need help understanding them

sand reef Jun 3, 2019, 5:20 PM

#

What do you not understand?

#

About them?

stoic beacon Jun 3, 2019, 5:20 PM

#

How they're calculated and represented yeah

sand reef Jun 3, 2019, 5:22 PM

#

You mean stuff like translation, rotation and scaling?

stoic beacon Jun 3, 2019, 5:22 PM

#

Mhm

sand reef Jun 3, 2019, 5:24 PM

#

Okay. I hope I don't end up talking about the ones used in graphics. Cuz I know of homogeneous coordinate system, and something like that is used.

#

Okay, so it goes like this.

olive willow Jun 3, 2019, 5:24 PM

#

so linear algebra ??

stoic beacon Jun 3, 2019, 5:25 PM

#

Yes

sand reef Jun 3, 2019, 5:25 PM

#

Given a transformation.

olive willow Jun 3, 2019, 5:25 PM

#

I'm also there

sand reef Jun 3, 2019, 5:25 PM

#

If it preserves addition and scalar multiplication, it's a linear transformation.
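Those two properties can be checked numerically. In this sketch the matrix, vectors and scalar are arbitrary made-up values. Multiplying by a matrix passes both checks, while a plain translation (adding a fixed offset) fails the addition check, so it is not linear in this sense.

```python
import numpy as np

# Hypothetical 2x2 matrix defining the transformation T(v) = A @ v
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

def T(v):
    return A @ v

u = np.array([1.0, -2.0])
v = np.array([4.0, 0.5])
c = 3.0

# Linearity: T(u + v) == T(u) + T(v) and T(c * u) == c * T(u)
additive = np.allclose(T(u + v), T(u) + T(v))
homogeneous = np.allclose(T(c * u), c * T(u))

# A translation moves every vector by the same offset
def shift(w):
    return w + np.array([1.0, 0.0])

shift_additive = np.allclose(shift(u + v), shift(u) + shift(v))

print(additive, homogeneous, shift_additive)
```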

stoic beacon Jun 3, 2019, 5:26 PM

#

Makes sense. So how do you calculate a transformation

sand reef Jun 3, 2019, 5:26 PM

#

It is a matrix transformation

#

Here, I'll give you a link.

#

https://textbooks.math.gatech.edu/ila/linear-transformations.html

olive willow Jun 3, 2019, 5:31 PM

#

so @stoic beacon you want to know how you can calculate where a vector would land after a linear transformation if you know where I and J hat land?

stoic beacon Jun 3, 2019, 5:31 PM

#

Yep. I think I understand I just want to make sure

#

Also, feels bad that a 14 year old understands this better than me pepe

olive willow Jun 3, 2019, 5:32 PM

#

hahahaha

#

so go to 4:10 in the 3b1b vid

#

you see I and J hat

#

J[0,1]

#

I[1,0]

stoic beacon Jun 3, 2019, 5:33 PM

#

Yep

olive willow Jun 3, 2019, 5:33 PM

#

the new vector, which we call vector V, is at [-1,2]

#

so V[-1,2]
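The idea in code, as a sketch: V is -1 times I hat plus 2 times J hat, and after a linear transformation it is the same combination of wherever I hat and J hat land. The landing spots below are an assumed 90 degree counterclockwise rotation, picked only for illustration and not taken from the video.

```python
import numpy as np

# Assumed landing spots of the basis vectors (90 degree counterclockwise rotation)
i_hat_new = np.array([0.0, 1.0])    # where I[1,0] lands
j_hat_new = np.array([-1.0, 0.0])   # where J[0,1] lands

v = np.array([-1.0, 2.0])           # V[-1,2] from the discussion

# Same coefficients, new basis vectors
landed = v[0] * i_hat_new + v[1] * j_hat_new

# Equivalent matrix form: the columns are the landing spots
M = np.column_stack([i_hat_new, j_hat_new])
landed_via_matrix = M @ v

print(landed, landed_via_matrix)
```

Both lines compute the same vector, which is why a 2x2 matrix is just a compact record of where I hat and J hat end up.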

stoic beacon Jun 3, 2019, 5:33 PM

#

Makes sense

#

Then he rotates the plane