#data-science-and-ml | Python | Page 200

olive willow Jun 3, 2019, 5:33 PM

#

then go to 4:21

stoic beacon Jun 3, 2019, 5:33 PM

#

Yep I'm there

olive willow Jun 3, 2019, 5:33 PM

#

you see where I and J hat moved

#

right?

stoic beacon Jun 3, 2019, 5:34 PM

#

Yeah

#

Oh I see

olive willow Jun 3, 2019, 5:34 PM

#

so vector V = -1I + 2J

stoic beacon Jun 3, 2019, 5:34 PM

#

It's -1 * [1, -2]

olive willow Jun 3, 2019, 5:34 PM

#

yes

stoic beacon Jun 3, 2019, 5:34 PM

#

Alrighty I got it

olive willow Jun 3, 2019, 5:34 PM

#

and then 2 *

#

that is what you call a linear combination

#

here's the formula

stoic beacon Jun 3, 2019, 5:35 PM

#

Got it. That's what I thought. Just need to watch the next video now

olive willow Jun 3, 2019, 5:35 PM

#

aV + bA

#

a and b are scalars

#

V and A are vectors to be more precise I and J hat for example
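
A minimal NumPy sketch of the linear combination described above, assuming the standard basis vectors for I hat and J hat (the names `i_hat`, `j_hat`, and `v` are illustrative, not from the chat):

```python
import numpy as np

i_hat = np.array([1, 0])   # standard basis vector along x (assumed)
j_hat = np.array([0, 1])   # standard basis vector along y (assumed)

a, b = -1, 2                # the scalars from V = -1I + 2J
v = a * i_hat + b * j_hat   # linear combination a*i_hat + b*j_hat
print(v)                    # [-1  2]
```

If I hat and J hat have been moved by a transformation, the same scalars are applied to their new coordinates instead.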

#

you know what the coordinates are of I and J hat

#

nvm

#

do you know how to solve a linear equation ?

stoic beacon Jun 3, 2019, 5:37 PM

#

Not sure

olive willow Jun 3, 2019, 5:37 PM

#

ok so this is the equation:

#

2x - y = 0
-x + 2y = 3
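
For reference, a small sketch of solving this system with NumPy, assuming it is rewritten in matrix form `A @ [x, y] = b` (the variable names are illustrative):

```python
import numpy as np

# Coefficients of  2x - y = 0  and  -x + 2y = 3
A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
b = np.array([0.0, 3.0])

solution = np.linalg.solve(A, b)
x, y = solution
print(x, y)  # x is 1 and y is 2, up to floating-point rounding
```

Checking by hand: 2(1) - 2 = 0 and -1 + 2(2) = 3.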

stoic beacon Jun 3, 2019, 5:38 PM

#

Oh yeah duh

olive willow Jun 3, 2019, 5:38 PM

#

https://www.youtube.com/watch?v=ZK3O402wf1c

YouTube

MIT OpenCourseWare

Lec 1 | MIT 18.06 Linear Algebra, Spring 2005

Lecture 1: The Geometry of Linear Equations. View the complete course at: http://ocw.mit.edu/18-06S05 License: Creative Commons BY-NC-SA More information at ...

stoic beacon Jun 3, 2019, 5:38 PM

#

I'll look at some other courses

#

Thanks

olive willow Jun 3, 2019, 5:38 PM

#

here you see how to use matrices and how to solve equations up to Rn

#

yh sure but in my opinion this is the best one. I'm 14 and the guys can explain it so that even I know exactly what's up

sand reef Jun 3, 2019, 5:39 PM

#

What's up?

stoic beacon Jun 3, 2019, 5:39 PM

#

But it's MIT. I'm just concerned it'll be over my jead

#

Head

olive willow Jun 3, 2019, 5:39 PM

#

nope dude at least not the first course

#

I understood it fully

sand reef Jun 3, 2019, 5:40 PM

#

Anybody can learn through MIT

#

Even the advanced courses are made easier.

olive willow Jun 3, 2019, 5:40 PM

#

yup

stoic beacon Jun 3, 2019, 5:40 PM

#

Is this course on edx or something?

sand reef Jun 3, 2019, 5:41 PM

#

Come to India. Unless you're already in India, then you'll see what's bad explanation. They don't even want to teach and they are assigned as teachers.

olive willow Jun 3, 2019, 5:41 PM

#

hahahha lol

#

yt dude

sand reef Jun 3, 2019, 5:41 PM

#

Okay then. Habe phun.

stoic beacon Jun 3, 2019, 5:42 PM

#

It's not on edx or Coursera?

olive willow Jun 3, 2019, 5:43 PM

#

idk but I found it on YT

stoic beacon Jun 3, 2019, 5:43 PM

#

That works

#

Wish I had your initiative at your age lol

olive willow Jun 3, 2019, 5:47 PM

#

hhahaha

#

like people at my age in my school are just partying and stuff, me be like: I just like math and programming

desert oar Jun 3, 2019, 5:55 PM

#

i can't recommend strang enough

#

it's an excellent course for getting comfortable with matrices of real numbers

olive willow Jun 3, 2019, 5:55 PM

#

I'm using numpy to get familiar with matrices and also solving linear equations

desert oar Jun 3, 2019, 5:56 PM

#

its good to do it on paper too

olive willow Jun 3, 2019, 6:01 PM

#

yh

#

I'm doing it on papaer

#

paper

#

not numpy but linear combies and equations

#

and also on my whiteboard

desert oar Jun 3, 2019, 6:03 PM

#

thats good

olive willow Jun 3, 2019, 6:03 PM

#

yup , after linear algebra, comes calc I think right?

#

for ML

stoic beacon Jun 3, 2019, 6:48 PM

#

@desert oar strange?

#

Strange*

#

Fucking autocorrect

#

You know what I mean

olive willow Jun 3, 2019, 7:21 PM

#

can someone give me an example of a 1d,2d,3d array?

misty sonnet Jun 3, 2019, 7:22 PM

#

@olive willow 1d is just a single "dimension" of a python list for example

#

So []

olive willow Jun 3, 2019, 7:22 PM

#

yh

misty sonnet Jun 3, 2019, 7:22 PM

#

2 dimensions means a list inside a list

olive willow Jun 3, 2019, 7:22 PM

#

then 2d is a list in a list

#

yh

misty sonnet Jun 3, 2019, 7:23 PM

#

So [[], []]

olive willow Jun 3, 2019, 7:23 PM

#

can there be one list

#

[[]]

#

or npt

#

not

misty sonnet Jun 3, 2019, 7:23 PM

#

Emm... I think so.

#

But that's pretty useless

#

And you already know what 3D is now.
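
The nesting idea above can be checked with NumPy. This is only a sketch with made-up values; `ndim` counts the layers of nesting and `shape` gives the length of each layer:

```python
import numpy as np

one_d = np.array([1, 2, 3])                  # shape (3,)
two_d = np.array([[1, 2], [3, 4]])           # shape (2, 2), a matrix
three_d = np.array([[[1, 2], [3, 4]],
                    [[5, 6], [7, 8]]])       # shape (2, 2, 2)
single_row = np.array([[1, 2, 3]])           # one list inside a list: shape (1, 3)

for arr in (one_d, two_d, three_d, single_row):
    print(arr.ndim, arr.shape)
```

So a single inner list is allowed: it is still 2D, just with one row.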

olive willow Jun 3, 2019, 7:23 PM

#

but how would you represent a matrix?

#

like this?

#

[[4 6]
 [0 9]]

#

and so this is a 2d array of numbers, otherwise known as a matrix, right?

misty sonnet Jun 3, 2019, 7:24 PM

#

Well, you should likely use numpy for this

olive willow Jun 3, 2019, 7:25 PM

#

I did

#

😃

misty sonnet Jun 3, 2019, 7:25 PM

#

Ah ok great

olive willow Jun 3, 2019, 7:25 PM

#

but I get this

misty sonnet Jun 3, 2019, 7:25 PM

#

Yes. a matrix is a 2d array

olive willow Jun 3, 2019, 7:25 PM

#

import numpy as np
import matplotlib.pyplot as plt

Array = np.array([[4, 6], [0, 9]])
Array2 = np.matrix(Array)
print(np.ndim(Array))
print(Array2)

plt.plot(Array[0], Array[1])
plt.xlim(0, 10)
plt.ylim(0, 10)
plt.show()

misty sonnet Jun 3, 2019, 7:25 PM

#

But you need to make sure it's laid out correctly

olive willow Jun 3, 2019, 7:25 PM

#

2
[[4 6]
 [0 9]]

misty sonnet Jun 3, 2019, 7:25 PM

#

Yes. that's right

#

You have 2 prints there

#

What's wrong with it?

olive willow Jun 3, 2019, 7:26 PM

#

this is a 3d tensor

#

right?

misty sonnet Jun 3, 2019, 7:27 PM

#

No.

#

that's a 2 by 2 matrix

olive willow Jun 3, 2019, 7:27 PM

#

oohhh

#

what's a tensor then?

misty sonnet Jun 3, 2019, 7:27 PM

#

3D

#

So another layer of lists

olive willow Jun 3, 2019, 7:28 PM

#

so that is still 2d

misty sonnet Jun 3, 2019, 7:28 PM

#

Yep

#

There's only 2 layers of lists

olive willow Jun 3, 2019, 7:29 PM

#

[ [ [4, 6], [3, 8] ], [ [2, 6], [1, 6] ] ]

#

this is 3 d

#

a 4 by 2 tensor right?

misty sonnet Jun 3, 2019, 7:30 PM

#

It's a...

#

Em...

#

Idek

#

It won't be x by y

#

It'll be x by y by z

olive willow Jun 3, 2019, 7:30 PM

#

so 3d

misty sonnet Jun 3, 2019, 7:30 PM

#

I think that's 2 by 2 by 2?

#

Yes.
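
One way to settle the shape question is to ask NumPy directly. A short sketch using the nested list from the chat (the name `tensor` is illustrative):

```python
import numpy as np

tensor = np.array([[[4, 6], [3, 8]],
                   [[2, 6], [1, 6]]])

print(tensor.ndim)       # 3
print(tensor.shape)      # (2, 2, 2)

# Indexing goes one layer at a time: block, then row, then column.
print(tensor[1, 0, 1])   # 6
```

Each axis has length 2, so the array is x by y by z with all three equal to 2.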

olive willow Jun 3, 2019, 7:31 PM

#

nvm, I will learn how to write vectors and matrices. this sht is hard

#

in programming. IRL it isn't that hard to do but in programming it's impossible

misty sonnet Jun 3, 2019, 7:32 PM

#

Haha

olive willow Jun 3, 2019, 7:32 PM

#

I'm 14 so I think I've some time hahahahah

misty sonnet Jun 3, 2019, 7:32 PM

#

Matrices are hard. Nevermind tensors

olive willow Jun 3, 2019, 7:32 PM

#

yup

misty sonnet Jun 3, 2019, 7:32 PM

#

So pick your battles

#

Definitely dude. Keep going tho;

olive willow Jun 3, 2019, 7:32 PM

#

then I choose vectors

#

hahah

misty sonnet Jun 3, 2019, 7:32 PM

#

It's impressive, keep up the enthusiasm

olive willow Jun 3, 2019, 7:32 PM

#

yh I know hardcore road to ML

#

sure dude thanks!

#

and let's not forget about indexing a matrix, not even talking about a tensor

stoic beacon Jun 3, 2019, 8:10 PM

#

So what in machine learning is represented by a 2d matrix?

#

Cuz I'm studying linear algebra and seeing this in Numpy I'm trying to apply it here

earnest prawn Jun 3, 2019, 8:31 PM

#

well every piece of data that requires more than one dimension like for example images have to be stored as 2d (or if you have rgb ones even 3d) data structures

#

and of course every transformation you apply to that piece of data also has to operate on those matrices

#

@stoic beacon
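
A rough sketch of what that looks like in NumPy, assuming 28x28 images and a tiny made-up feature table (all sizes and values here are illustrative):

```python
import numpy as np

gray = np.zeros((28, 28), dtype=np.uint8)      # grayscale: height x width
rgb = np.zeros((28, 28, 3), dtype=np.uint8)    # colour: height x width x channels
rgb[..., 0] = 255                              # fill the red channel

# A transformation is arithmetic on the whole array, e.g. inverting brightness.
inverted = 255 - gray

# Tabular data is also a 2D matrix: rows are samples, columns are features.
table = np.array([[5.1, 3.5],
                  [4.9, 3.0],
                  [6.2, 2.9]])
print(gray.shape, rgb.shape, table.shape)
```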

stoic beacon Jun 3, 2019, 8:56 PM

#

Thanks man

#

I can't decide if I want to use MLB stats to predict game outcomes or try the MNIST database. Both would be a NN

#

But if I go about looking for the MLB dataset, I wouldn't know what to look for or how to format the data. I know Kaggle exists but once I find a dataset there I get confused on what would be used for inputs or how to even decide what gets used for inputs

median siren Jun 3, 2019, 10:05 PM

#

Hi all, I've posted a similar question in the r/LearnMachineLearning discord, but hopefully someone here can help me out, too.

I'm trying to follow the following tutorial:

http://linanqiu.github.io/2015/10/07/word2vec-sentiment/

However, when I try to replace the vectors in numpy.zeros with my own embeddings, I get the following error:

ValueError: setting an array element with a sequence.

Does anyone have any experience with this and / or how to solve this?

Sentiment Analysis Using Doc2Vec · Pandamonium

Sentiment Analysis Using Doc2Vec - Linan Qiu

reef bone Jun 3, 2019, 10:14 PM

#

The error is quite explicit, it tells you that you are trying to set an element of a numpy array to a sequence, rather than a single value (scalar)

#

It's hard to help more without seeing the code

lean ledge Jun 3, 2019, 10:15 PM

#

@stoic beacon MIT people learn the same stuff people from other universities do. They're not gods

#

I use online courses for MIT or similar for the majority of the stuff

#

The resources are just better

reef bone Jun 3, 2019, 10:16 PM

#

(I would assume that the error comes from you trying to set an element to a list or another array that holds the embeddings for a certain word)
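
A small reproduction of that guess, not the tutorial's actual code: the embedding values and array sizes below are hypothetical.

```python
import numpy as np

embedding = [0.1, 0.2, 0.3]      # hypothetical 3-dimensional word vector

flat = np.zeros(4)               # 1D array: each slot holds one scalar
error_message = None
try:
    flat[0] = embedding          # a sequence does not fit into a scalar slot
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

table = np.zeros((4, 3))         # 2D array: one row per word
table[0] = embedding             # a length-3 row accepts a length-3 sequence
```

If the target array has one row per word and one column per embedding dimension, the assignment works.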

stoic beacon Jun 3, 2019, 10:16 PM

#

Fair enough man. I always just assume MIT, Harvard, and the like are all harder since yaknow, they're for like...smart people and whatnot

#

Usually

#

Unless you have money but we won't go there

lean ledge Jun 3, 2019, 10:19 PM

#

They might go slightly more in depth but that depth is good and still accessible. Where they really shine is specialised highest level (fourth year and grad) courses that both exist and are well done

#

Eg. MIT's underactuated robotics or Stanford's CNNs for deep CV courses are not something that are easy to find elsewhere

frigid jacinth Jun 3, 2019, 11:09 PM

#

hello.... so there is a weird idea just popped up in my mind and I don't know which AI/machine learning libraries do I need if I want to achieve the following scenario...

Scenario 1 :
(The AI know nothing)
(The User's facebook account is public and has set the birthday)
AI:hi
User:I am John
AI:Hello, John
User:My facebook account is XXX
(the AI will now know the user's facebook)
AI: OK
User:How old am I?
(Then the AI will go to his facebook and search for the data)
AI: (give the answer)

Scenario 2 :
(The AI knows user is john now)
AI:Hello, John
User:What is the result of the barcelona vs liverpool on 8/5?
(The AI will now go search in google)
AI: Liverpool won and it is 4:0 (something like this)

Sorry...I know this kind of confusing...and thank you for trying to help me out...

desert oar Jun 3, 2019, 11:48 PM

#

i think thats basically what siri does

frigid jacinth Jun 4, 2019, 12:03 AM

#

Oh that's right..never thought of this before lol

#

thank you

stoic beacon Jun 4, 2019, 12:15 AM

#

So Keras comes with the MNIST dataset but if it didn't how would you load that?

#

Since it's images

#

Also, side note: is TensorFlow hard to learn?

stoic beacon Jun 4, 2019, 2:05 AM

#

Also also, since a vector typically represents magnitude and direction how do vectors relate to machine learning?

#

I assume they're not talking about the same thing

sand reef Jun 4, 2019, 2:22 AM

#

In ML they are basically column only matrices. Vectors are just one column of values in ML.

#

Why are they called vectors? Because when you represent a vector in n dimensions, you can write it as:
ai + bj + ck +...

#

So, you can instead of writing i, j, k,...
Write as a column matrix

#

With [[a] [b] [c]...]
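
The same idea as a sketch in NumPy, with made-up scalars and the standard basis taken from the rows of an identity matrix:

```python
import numpy as np

a, b, c = 2, -1, 3
i, j, k = np.eye(3)                      # rows of the identity are the basis vectors

as_sum = a * i + b * j + c * k           # ai + bj + ck
as_column = np.array([[a], [b], [c]])    # column matrix, shape (3, 1)

print(as_sum)
print(as_column.shape)                   # (3, 1)
```

Both hold the same three numbers; the column form just drops the basis symbols.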

#

If Keras didn't come with the dataset loaded, you would have to download the dataset manually. Check sentdex. He has a video on how to load a dataset. It's the second video of his new TensorFlow/Keras tutorial.

#

TensorFlow is lower level than Keras, but it's not that hard.

#

@stoic beacon
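
As a sketch of the manual route: the original MNIST image files use the IDX format, which is a 16-byte big-endian header (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The function name `parse_idx_images` is hypothetical, and the input below is synthetic data standing in for a downloaded file.

```python
import struct
import numpy as np

def parse_idx_images(raw: bytes) -> np.ndarray:
    """Parse the bytes of an already decompressed IDX image file."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"unexpected magic number {magic}")
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    return pixels.reshape(count, rows, cols)

# Synthetic stand-in for a real file: two blank 28x28 images.
fake_file = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
images = parse_idx_images(fake_file)
print(images.shape)  # (2, 28, 28)
```

The distributed files are gzip-compressed, so a real download would be read with something like `gzip.open(path, "rb").read()` before parsing.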

stoic beacon Jun 4, 2019, 2:29 AM

#

Thanks for the responses bud

#

I'll give learning TF a shot

void anvil Jun 4, 2019, 3:22 AM

#

Can anyone explain why autoencoders are so popular compared to all the other models?

#

From what I've seen, it's not all that great in practice

sand reef Jun 4, 2019, 3:34 AM

#

Well, it captures representations pretty well.

#

Reduces the noise to a minimum, if not even removing it.

#

And it's able to output something as close as possible to the original input.

lean ledge Jun 4, 2019, 3:42 AM

#

@void anvil Depends on what they're for. For vision-y tasks, there's a lot more detail involved in making good images so autoencoders on their own don't work well but they're a pretty simple and cool way of reducing the dimensionality of your data to a much more dense representation
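
To make the dimensionality-reduction point concrete, here is a toy linear autoencoder in plain NumPy. It is a sketch under strong assumptions: synthetic 5-D data that really lies near a 2-D subspace, no nonlinearity, and a hand-written gradient step. Real autoencoders are normally built with a deep learning library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 5-D generated from 2 hidden factors plus small noise.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 1.0, 0.0]])
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

# Encoder squeezes 5 -> 2, decoder expands 2 -> 5.
W_enc = rng.normal(scale=0.1, size=(5, 2))
W_dec = rng.normal(scale=0.1, size=(2, 5))

def reconstruction_loss(X, W_enc, W_dec):
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial_loss = reconstruction_loss(X, W_enc, W_dec)

learning_rate = 0.01
for _ in range(2000):
    code = X @ W_enc                  # dense 2-D representation
    err = code @ W_dec - X            # reconstruction error
    # Gradients of the squared error, up to a constant factor.
    grad_dec = code.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)
print(initial_loss, final_loss)
```

After training, `X @ W_enc` is the compressed 2-D version of each 5-D point.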

lapis sequoia Jun 4, 2019, 4:05 AM

#

autoencoders are now used extensively in NLP tasks too.. look up BERT..

#

for capturing context aware representations of words.. ergo word to sentence embeddings

lapis sequoia Jun 4, 2019, 4:23 AM

#

also.. dont mind me making nlp sound sexy.. it's not.. it's mostly mind numbing work and lot of Lisp :v.. I should've stuck to image processing..

lean ledge Jun 4, 2019, 4:24 AM

#

Image processing is the fun stuff

#

Signals turn me on

lapis sequoia Jun 4, 2019, 4:33 AM

#

Im trying to make sense of some numbers

#

consider I computed correlation of x vs a set (t, u , v, y, z )

lapis sequoia Jun 4, 2019, 4:51 AM

#

what I stated above was cosine similarity..but apparently it's the same as pearson correlation coefficient

#

for centered vectors..
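
A quick numerical check of that claim on random data (a sketch; the helper `cosine_similarity` is defined here, not taken from a library):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)

def cosine_similarity(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

pearson = np.corrcoef(x, y)[0, 1]
cosine_centered = cosine_similarity(x - x.mean(), y - y.mean())
cosine_raw = cosine_similarity(x, y)

print(pearson, cosine_centered, cosine_raw)
```

The first two numbers agree; the raw cosine similarity only matches when the vectors already have zero mean.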

silent swan Jun 4, 2019, 6:11 AM

#

naw NLP is great

karmic geyser Jun 4, 2019, 7:42 AM

#

anyone have experience with lowpass/highpass/bandpass filters on digital audio samples?

lean ledge Jun 4, 2019, 7:48 AM

#

@karmic geyser me

#

why

#

Spent this entire semester + possibly the next one if I take the DSP elective

#

You've been typing for a while 👀 Scared of how long the question might be

karmic geyser Jun 4, 2019, 7:55 AM

#

I'm using sounddevice in Python to get audio input, then output it with low latency. I want to turn a stereo audio input into 5 or 6 channels, which I will then output to a subwoofer, midrange speakers, and finally tweeters. Pretty much I'm trying to do a 3-way crossover in software. For example, say I have an audio stream at a 44.1 kHz sampling rate and I have 1024 samples in an array: would I need to add some kind of delay of like 30 samples or so if I wanted to reduce the volume of everything below 3500 Hz by like 6 dB an octave? Also, what books or online resources would you recommend for basic stuff like a Butterworth filter?

#

@lean ledge

lean ledge Jun 4, 2019, 7:58 AM

#

Why do you feel you need a delay of 30 samples? For notes, I would study from MIT's 6.007 Signals and Systems, lecture notes here https://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/lecture-notes/, in particular, butterworth filters here https://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/lecture-notes/MITRES_6_007S11_lec24.pdf

karmic geyser Jun 4, 2019, 8:00 AM

#

The delay would be so when I go from 1 chunk of samples to the next chunk the filter would still be smooth.

#

30 samples was arbitrary but I don't know how many samples a normal filter would use.

#

maybe not so much a delay but memory of the last 30 samples or 30 processed samples given to the filter for each channel.

#

Thanks for the lecture notes, they are quite good. I don't have much experience with filters or reading and understanding university level math. I tend to understand math better if it's written in a programming language.

lean ledge Jun 4, 2019, 8:10 AM

#

When you're done with signal processing basics there's https://ocw.mit.edu/resources/res-6-008-digital-signal-processing-spring-2011/ for focus on digital signals and then https://www.coursera.org/learn/audio-signal-processing for focusing on audio signals

MIT OpenCourseWare

Digital Signal Processing

This course was developed in 1987 by the MIT Center for Advanced Engineering Studies. It was designed as a distance-education course for engineers and scientists in the workplace. Advances in integrated circuit technology have had a major impact on the technical areas to whic...

Coursera

Audio Signal Processing for Music Applications | Coursera

Learn Audio Signal Processing for Music Applications from Universitat Pompeu Fabra of Barcelona, Stanford University. In this course you will learn about audio signal processing methodologies that are specific for music and of use in real ...

#

I'm not sure I can help you with processing audio in chunks because while I've done a bunch of signal processing, it has been in the context of signal theory rather than the details of specific implementations, but I believe you're looking for techniques involving Hann and Hamming windowing functions and then merging on top

karmic geyser Jun 4, 2019, 8:14 AM

#

wouldn't that stuff be more for showing a spectrogram?

#

the windowing functions?

lean ledge Jun 4, 2019, 8:15 AM

#

I'll just say that if you use an IIR filter (like a Butterworth filter) rather than the windowing approach, you might be able to get decent filtering without worrying about merging the output of separate buffer frames

#

Uh what do you mean?

karmic geyser Jun 4, 2019, 8:17 AM

#

I feel like you would pass your samples through a windowing function and it would let you know how much activity is going on at a certain frequency bandwidth. and you would just repeat that say 1024 times with different bandwidths and use the output to make a spectrogram.

lean ledge Jun 4, 2019, 8:18 AM

#

What exactly are you trying to do?

#

Create a spectrogram or use the output to drive audio?

karmic geyser Jun 4, 2019, 8:19 AM

#

lower the volume of frequencies below 3500 Hz on a continuous stream of digital audio samples.

#

with a curve so the lower the frequency the lower its volume is.

lean ledge Jun 4, 2019, 8:24 AM

#

@karmic geyser To skip the theory for you, construct a butterworth filter with the parameters you want (it's simple with scipy), then take advantage of the fact that butterworth is IIR and use the zf value that the lfilter function returns after you use a filter and pass that in into the next filter operation when you run the next batch
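
A sketch of that suggestion with SciPy, under assumptions: a first-order high-pass (roughly 6 dB per octave) at 3500 Hz, a 44.1 kHz sampling rate, 1024-sample chunks, and random noise standing in for the audio stream. The variable names are illustrative and this is not the code from the gist.

```python
import numpy as np
from scipy import signal

fs = 44100          # sampling rate in Hz
cutoff = 3500       # crossover frequency in Hz
chunk_size = 1024   # samples per audio block

# First-order high-pass: attenuates content below the cutoff.
b, a = signal.butter(1, cutoff, btype="highpass", fs=fs)

rng = np.random.default_rng(0)
audio = rng.normal(size=4 * chunk_size)   # stand-in for a mono input stream

# Filter chunk by chunk, feeding each call's final state (zf) into the next call.
state = np.zeros(max(len(a), len(b)) - 1)
chunks = []
for start in range(0, len(audio), chunk_size):
    out, state = signal.lfilter(b, a, audio[start:start + chunk_size], zi=state)
    chunks.append(out)
streamed = np.concatenate(chunks)

# Filtering the whole signal in one call gives the same result.
one_shot = signal.lfilter(b, a, audio)
print(np.allclose(streamed, one_shot))
```

Because the state carries the filter's memory across chunk boundaries, no extra delay or overlap is needed. For higher-order filters, SciPy's documentation recommends the second-order-sections form (`output="sos"` with `sosfilt`), which is numerically better behaved.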

karmic geyser Jun 4, 2019, 8:25 AM

#

I tried that but it didn't seem to be working. It was like it was just lowering the volume of the entire frequency spectrum.

lean ledge Jun 4, 2019, 8:25 AM

#

Or you can use a pre-built system like GNU Radio with Python to set up the streaming architecture for yourself

#

@karmic geyser Every filter will reduce the volume to some extent

#

it shouldn't be by a lot

#

should be disproportionate

#

very disproportionate

#

you can always reamplify by multiplying by a constant as long as the wrong frequencies are filtered out

#

if they're not filtered out, there's probably something wrong with your parameters

karmic geyser Jun 4, 2019, 8:27 AM

#

let me just quickly upload the code I have. It was meant to be a 6th-order Butterworth but it was making the signal inaudible. If I multiplied the signal by like 2048 I could hear it again; it mostly seemed to be the same frequencies but with some distortion from compacting and expanding the samples.

lean ledge Jun 4, 2019, 8:28 AM

#

How did you get the parameters?

#

You can use something like PyFDA to come up with the perfect filter https://github.com/chipmuenk/pyFDA

GitHub

chipmuenk/pyFDA

Python Filter Design Analysis Tool. Contribute to chipmuenk/pyFDA development by creating an account on GitHub.

#

v good for filter design, I love it

#

Anyways, i'm dead tired from studying for my signals course, this isnt helping much :p I'm gonna go take a rest

#

Good luck!

#

hopefully the resources I linked can help a bit

karmic geyser Jun 4, 2019, 8:32 AM

#

Oh okay. this is my code.

#

https://gist.github.com/Sartek/63fae42be038ea5fe867ac4caebf1c6b

Gist

3 way crossover

3 way crossover. GitHub Gist: instantly share code, notes, and snippets.

#

line 10-16 is the filter values 18-22 applies the filter, 53-63 is where I actually pass the data

foggy bridge Jun 4, 2019, 11:27 AM

#

Hello everyone

#

i have a question regarding pandas

#

whats the best source to learn?

lyric canopy Jun 4, 2019, 11:27 AM

#

There are links to tutorials in the official documentation: https://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html

#

There's also a "10 minutes to pandas" tutorial in the official pandas documentation

foggy bridge Jun 4, 2019, 11:29 AM

#

thank you @lyric canopy

stoic beacon Jun 4, 2019, 12:30 PM

#

When using Colab, where do you save CSV files to be read in

olive willow Jun 4, 2019, 1:15 PM

#

yo guys!

stoic beacon Jun 4, 2019, 1:17 PM

#

Morning

olive willow Jun 4, 2019, 1:17 PM

#

howdy?

stoic beacon Jun 4, 2019, 1:17 PM

#

Eh no

#

Don't say that

olive willow Jun 4, 2019, 1:17 PM

#

hahahah hwry?

stoic beacon Jun 4, 2019, 1:18 PM

#

Cuz you're not a cowboy lol

olive willow Jun 4, 2019, 1:18 PM

#

yh I know 😦

#

hahaha

stoic beacon Jun 4, 2019, 1:18 PM

#

So stick to calculus

#

Damn whippersnappers

olive willow Jun 4, 2019, 1:19 PM

#

hahaha

lapis sequoia Jun 4, 2019, 1:35 PM

#

🤠

#

@stoic beacon depending on how big it is, you can save it locally or on cloud storage..

stoic beacon Jun 4, 2019, 1:47 PM

#

Awesome thanks

olive willow Jun 4, 2019, 6:16 PM

#

guys on what do you need calc in data science? just curious

desert oar Jun 4, 2019, 6:47 PM

#

optimization

#

understanding and computing gradients is really important

#

understanding at least how and why convex optimization works is important

#

also finite series come up a lot, that's usually covered in calc courses even if it's not strictly calculus
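
A tiny illustration of the gradient point: gradient descent on a convex function of one variable. The function and learning rate are made up for the example.

```python
def f(x):
    return (x - 3.0) ** 2        # convex, minimum at x = 3

def grad_f(x):
    return 2.0 * (x - 3.0)       # derivative of f, from calculus

x = 0.0
learning_rate = 0.1
for _ in range(100):
    x -= learning_rate * grad_f(x)   # step against the gradient

print(x, f(x))
```

Each step shrinks the distance to the minimum by a constant factor, which is why the derivative is the key ingredient.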

craggy geyser Jun 4, 2019, 7:49 PM

#

quick question: for pandas, I have a dataframe where I have a timestamp column, an ID column, and another column category. In some cases, three rows can have the same ID and timestamp, but three different categories. Is there an easy way to drop all rows where this happens except one of them?

#

follow-up: This is not important, it doesn't really matter which row I keep since multiples is a good sign, but I have a 4th column snr which I could use to select which one to keep, i.e. keep the row with the highest snr value

polar acorn Jun 4, 2019, 8:07 PM

#

@craggy geyser
Take a look at the following code (copy pasted from stackoverflow). It creates a dummy df and drops all rows where the values in the A and C column are not unique. It keeps the first non unique row.

import pandas as pd
df = pd.DataFrame({"A":["foo", "foo", "foo", "bar"], "B":[0,1,1,1], "C":["A","A","B","A"]})
df = df.drop_duplicates(subset=['A', 'C'], keep='first')

#

If you want to keep the row with the highest snr value and you don't mind changing the order of your df you can sort on snr to begin with before dropping.
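
A sketch of the sort-then-drop approach, assuming hypothetical column names `timestamp`, `id`, `category`, and `snr`:

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["t1", "t1", "t1", "t2"],
    "id": [1, 1, 1, 2],
    "category": ["a", "b", "c", "a"],
    "snr": [3.0, 9.0, 5.0, 4.0],
})

deduped = (
    df.sort_values("snr", ascending=False)                       # best snr first
      .drop_duplicates(subset=["timestamp", "id"], keep="first")  # keep that row
      .sort_index()                                                # restore original order
)
print(deduped)
```

The final `sort_index()` is optional; it only undoes the reordering caused by the sort.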

craggy geyser Jun 4, 2019, 8:11 PM

#

ah, I see, by feeding it the columns, It will drop duplicates where the pair of those two columns are the same. That makes sense, and when you write here now I think I actually have done this in the past, should have remembered

#

and yes, that makes sense with snr of course

#

thanks!

#

👍🏼

polar acorn Jun 4, 2019, 8:12 PM

#

np 😃

olive willow Jun 4, 2019, 8:13 PM

#

so yh I'm learning data science and am thinking about buying a course, do they really cover at least the most of the stuff you really need to know?
because I'm thinking of buying the datacamp subscription
is it any good or are there better courses

desert oar Jun 4, 2019, 8:33 PM

#

why not do a free one?

#

like fast.ai

#

that's machine learning focused, but machine learning is a fine place to start nowadays for more general data science

silent swan Jun 4, 2019, 8:40 PM

#

isn't fast.ai significantly deep-learning focused?

#

(I think it's good, but not sure if it's the best recommendation for general data science.)

desert oar Jun 4, 2019, 8:41 PM

#

yeah it is

#

hes also 14 😛

#

if you already know how to code, i don't see a problem with starting with deep learning

silent swan Jun 4, 2019, 8:41 PM

#

oh, yeah then disregard data science, acquire AGI skillz

desert oar Jun 4, 2019, 8:41 PM

#

especially if you're doing it as a hobby

#

if you don't fall into the "arrogant AI guy" trap then you should be fine transitioning into general data science

silent swan Jun 4, 2019, 8:42 PM

#

deep learning would've been such a blast if I could've started earlier

desert oar Jun 4, 2019, 8:42 PM

#

you can learn probability and stats later once you know the math and coding

silent swan Jun 4, 2019, 8:42 PM

#

instead I was learning javascript before javascript became good

desert oar Jun 4, 2019, 8:42 PM

#

heh

lean ledge Jun 4, 2019, 8:42 PM

#

Deep learning should be learnt after ML

#

For one, deep learning is mostly useless and bad

lapis sequoia Jun 4, 2019, 8:43 PM

#

What’s deep learning for

desert oar Jun 4, 2019, 8:43 PM

#

Facebook begs to differ @lean ledge

lean ledge Jun 4, 2019, 8:43 PM

#

For another, it's easier to learn how to treat it like any other model when you know how other models work

silent swan Jun 4, 2019, 8:43 PM

#

I don't think it's mostly useless and bad if you use it in places where it's clearly good. But don't use it to predict stock prices

desert oar Jun 4, 2019, 8:43 PM

#

facebook, google, openai, et al

lean ledge Jun 4, 2019, 8:43 PM

#

How so?

#

Research ≠ practice

#

I am very very aware of ML research, I assure you

#

Deep learning excels in a few tasks but in practice as a data scientist, you almost never use deep learning

desert oar Jun 4, 2019, 8:44 PM

#

of course

silent swan Jun 4, 2019, 8:44 PM

#

hence why disregard data science, acquire AGI skillz :p

desert oar Jun 4, 2019, 8:44 PM

#

i literally never use it

#

does that make it useless and bad?

lean ledge Jun 4, 2019, 8:45 PM

#

Deep learning is the way to do CV and NLP but apart from that there's few uses for it

desert oar Jun 4, 2019, 8:45 PM

#

which are huge problem domains right now

silent swan Jun 4, 2019, 8:45 PM

#

actually though, as a 14 year old, deep learning will be much more fun/better as a hobby than learning how to do pivot tables

desert oar Jun 4, 2019, 8:45 PM

#

at least 50% of the data science jobs i see are either CV or NLP or audio related

silent swan Jun 4, 2019, 8:45 PM

#

if this were a first college course I'd say yea go learn some statistics first

lean ledge Jun 4, 2019, 8:45 PM

#

👀👀👀 we must be seeing very different jobs

desert oar Jun 4, 2019, 8:45 PM

#

unstructured data is the big data of 2019

#

its a fad in some regards

#

but in others its a genuine big step forward

silent swan Jun 4, 2019, 8:46 PM

#

but if he's going to have fun with CycleGANs and make cool pixelated pokemon recolors I say go do that

desert oar Jun 4, 2019, 8:46 PM

#

^^

#

also theres no point being hyperbolic and inflammatory

lean ledge Jun 4, 2019, 8:46 PM

#

CV and NLP are a minority of data science jobs and they require a large large amount of specialisation for the average job

#

You basically have to spend a year learning just deep vision after having already studied other DL and ML stuff in order to catch up on SOTA

desert oar Jun 4, 2019, 8:47 PM

#

thats fair

reef bone Jun 4, 2019, 8:47 PM

#

Where are you looking that half the data science jobs you see are CV or NLP or audio related?

desert oar Jun 4, 2019, 8:47 PM

#

my recommendation was targeted at a bright kid who's already good at programming and math, and wants a place to get started

#

@reef bone maybe in the wrong places

lean ledge Jun 4, 2019, 8:47 PM

#

Yeah I rarely ever see a CV or NLP job lol

reef bone Jun 4, 2019, 8:47 PM

#

I'm genuinely wondering because I rarely see those at all

desert oar Jun 4, 2019, 8:47 PM

#

anyway i wouldnt have made that recommendation to anyone else

lean ledge Jun 4, 2019, 8:48 PM

#

The few CV jobs I see are specialised robotics related jobs

silent swan Jun 4, 2019, 8:48 PM

#

it's like telling a kid "Don't learn javascript, start with learning big O and data structures"

desert oar Jun 4, 2019, 8:48 PM

#

or "dont learn C++ learn python instead"

#

it was just a recommendation

reef bone Jun 4, 2019, 8:48 PM

#

DL is an approach to ML, I don't think you can really learn DL without ML

desert oar Jun 4, 2019, 8:48 PM

#

and frankly im only resisting this at all because your tone was confrontational

#

unnecessarily so imo

silent swan Jun 4, 2019, 8:49 PM

#

tl;dr use CycleGANs to make new pokemon sprites, but also read Murphy

desert oar Jun 4, 2019, 8:49 PM

#

^

lean ledge Jun 4, 2019, 8:49 PM

#

I definitely think classic ML should be learnt before DL. It makes people too comfortable trying to use DL because that's what they're used to. It builds weak foundations in ML to start at DL.

desert oar Jun 4, 2019, 8:49 PM

#

i agree, for anyone over 14

silent swan Jun 4, 2019, 8:49 PM

#

I 100% agree for people getting serious in the topic

#

I disagree for a hobbyist wanting to pick up something new and cool

desert oar Jun 4, 2019, 8:50 PM

#

its like saying to learn what a hash table is before using turtle graphics or pyqt5

lean ledge Jun 4, 2019, 8:50 PM

#

I s2g GAN SOTA changes faster than the hottest JS frameworks

desert oar Jun 4, 2019, 8:50 PM

#

you literally dont need to know

#

who ever said SOTA

silent swan Jun 4, 2019, 8:50 PM

#

lol feel like we're talking past each other at this point

desert oar Jun 4, 2019, 8:50 PM

#

you still learn about probabilities, objective functions, et al

#

doing mnist

lean ledge Jun 4, 2019, 8:51 PM

#

(I was not referring to learning SOTA, just joking about the GAN hype)

desert oar Jun 4, 2019, 8:51 PM

#

anyway if you have a recommendation for a free data science course that isnt fast.ai, i'm sure the person who asked the question originally would appreciate the recommendation

#

and i would too, so i can recommend to others

silent swan Jun 4, 2019, 8:52 PM

#

oddly enough, Jeremy Howard probably would have been great for a datascience course

#

afaik that's his background

polar acorn Jun 4, 2019, 8:53 PM

#

Honestly though @lean ledge the all caps nick is making you look more upset about this than you probably are 🤔

lean ledge Jun 4, 2019, 8:53 PM

#

There are many other than fast.ai. Andrew Ng's course, Columbia's ML course (my preference), Google and Microsoft have their own free ones, etc.

#

Probably tbh

silent swan Jun 4, 2019, 8:54 PM

#

when people talk about Andrew Ng's course, are they still talking about the coursera one? or deeplearning.ai

#

honestly even pre-DL boom, I never liked his coursera course

lean ledge Jun 4, 2019, 8:54 PM

#

I generally assume coursera unless proven otherwise

#

I didn't like it either

#

It's too shallow

#

And too much "don't worry if you don't understand" going on

silent swan Jun 4, 2019, 8:54 PM

#

also octave lol

lean ledge Jun 4, 2019, 8:55 PM

#

That too

silent swan Jun 4, 2019, 8:55 PM

#

my main gripe with fast.ai is that that group is incredibly self-promotional

#

the content is solid though (if a little bit loosey-goosey)

desert oar Jun 4, 2019, 8:56 PM

#

i didnt like the coursera course either

#

not only "dont worry about it"

#

but also octave 🤢

silent swan Jun 4, 2019, 8:57 PM

#

3/3 surveyed people hate the coursera course lol

desert oar Jun 4, 2019, 8:57 PM

#

and teaching linear regression with gradient descent was always weird to me too

silent swan Jun 4, 2019, 8:57 PM

#

yes!

lean ledge Jun 4, 2019, 8:57 PM

#

n o r m a l E q U a t I O n
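
For context, the normal equation solves linear regression in closed form, with no gradient descent loop. A sketch on synthetic data (the true slope 2 and intercept 1 are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=100)

X = np.column_stack([np.ones_like(x), x])     # design matrix with an intercept column

# Normal equation: solve (X^T X) theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)

# NumPy's least-squares routine gives the same answer.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta, theta_lstsq)
```

`theta[0]` is the intercept and `theta[1]` is the slope.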

polar acorn Jun 4, 2019, 8:57 PM

#

I liked it but then again I already had a background in statistics and maths and learned to program in Matlab so maybe it was intended for me.

olive willow Jun 4, 2019, 8:58 PM

#

I'm back

#

wooww

silent swan Jun 4, 2019, 8:58 PM

#

we've concluded that you need to read SICP but also He et al. 2015

desert oar Jun 4, 2019, 8:58 PM

#

i didnt know columbia had a free ML course

lean ledge Jun 4, 2019, 8:58 PM

#

It was too boring for me. Not Mathy enough. Columbia's course was much nicer and when I looked back on it, I realised the topics were so practical. Stuff I use or see being used all the time in real world DS

desert oar Jun 4, 2019, 8:58 PM

#

i wonder what a more general purpose "data science" course would look like

#

vs a "ML" course which is what i normally see

silent swan Jun 4, 2019, 8:59 PM

#

data science is whatever you want it to be

desert oar Jun 4, 2019, 8:59 PM

#

well sure. i assume it'd spend more time talking about probability and stats, as well as data visualization

silent swan Jun 4, 2019, 8:59 PM

#

domain-specific business logic? sure!

#

hardcore data-engineering principles? why not!

desert oar Jun 4, 2019, 8:59 PM

#

heh

#

i'd assume that an "intro to data science" course is like 1/2 ML and 1/2 stats

silent swan Jun 4, 2019, 9:00 PM

#

ELBo for building VAEs? throw it in!

desert oar Jun 4, 2019, 9:00 PM

#

maybe start with stats and do ML at the end once they're a little more comfortable with the math and coding

lean ledge Jun 4, 2019, 9:00 PM

#

Look at how good that syllabus is!

📎 Screenshot_20190605-065958_Chrome.jpg

desert oar Jun 4, 2019, 9:00 PM

#

which syllabus is this, columbia?

lean ledge Jun 4, 2019, 9:00 PM

#

Yah

desert oar Jun 4, 2019, 9:00 PM

#

that's pretty comprehensive

#

and fast moving

#

what are the prerequisites?

#

it's free online? that's pretty sweet

silent swan Jun 4, 2019, 9:01 PM

#

that's more ML than DS though?

lean ledge Jun 4, 2019, 9:01 PM

#

DS is basically ML though? + Domain knowledge and blah blah hype words

desert oar Jun 4, 2019, 9:01 PM

#

sorta, ML+Stats

#

you actually have to talk to the business people

#

at least that's my expectation when i see "data science"

silent swan Jun 4, 2019, 9:02 PM

#

I do actually think some level of data engineering should be in data science

desert oar Jun 4, 2019, 9:02 PM

#

like what

#

basic principles of indexing a database or something like that?

silent swan Jun 4, 2019, 9:03 PM

#

databases, mapreduce

desert oar Jun 4, 2019, 9:03 PM

#

does anyone actually write low level map reduce stuff though nowadays

#

people just use spark

#

unless you truly have enormous data

silent swan Jun 4, 2019, 9:03 PM

#

would still be good to know the underlying principles though

desert oar Jun 4, 2019, 9:03 PM

#

oh, the concept? yeah definitely

lean ledge Jun 4, 2019, 9:03 PM

#

But yeah it's a good course. Builds good fundamentals in the first 6 weeks and then goes over good foundations for related stuff like drawing out true latent factors through matrix factorisation and PCA, Markovian models, continuous state space extension to those, etc

desert oar Jun 4, 2019, 9:03 PM

#

i think that more falls under general programming skills than data engineering though

#

yeah, ill go look over the material at some point and start directing people there who ask

#

thanks

silent swan Jun 4, 2019, 9:04 PM

#

arguably even (very practical) things like database integrity when you have parallel requests
don't have to know how DBs actually handle them, but you need to know that it's an issue that people have to think about

#

I would argue that data science should cover how people have to handle data

#

of course, there're different perspectives

#

like my bayesian stat friends who treat deep learning as just "function approximators"

#

which isn't wrong

desert oar Jun 4, 2019, 9:10 PM

#

im one of those people i think 😛

#

i also think that data engineering generally can be learned on the job

#

and i think a good org will deliberately get you to push your limits in that regard

olive willow Jun 4, 2019, 9:12 PM

#

btw guys do you need like linear algebra and calc to start learning ML and understanding it?

desert oar Jun 4, 2019, 9:13 PM

#

you can start, but you won't get that far

#

you might also build up some bad habits and mistaken ideas without knowing how it works

#

for understanding it, they are necessary

olive willow Jun 4, 2019, 9:20 PM

#

so even before using linear regression, I should understand the formula

#

sht

#

that will take hella a lot of time

#

I'm on linear transformation rn

#

and calc == nothing yet

#

so it will take like 2 years right? around that

#

to understand a little bit more than the basics of ML

polar acorn Jun 4, 2019, 9:30 PM

#

My 2 cents, at 14 I would play around with what I found enjoyable. If you put up a rigorous schedule for learning DL and/or ML and all related fields, you might find you are sick of it after two weeks and then do something else. Playing around and learning stuff in a suboptimal way is always better than giving up on learning it the right way. Playing around with a conv net for CV, even if you don't understand everything that's going on, is much better learning than reading the intro chapter of some advanced calc book and then putting the rest of it away forever. If you enjoy it you'll find yourself learning the whole field soon enough. So first of all know yourself and then figure out what you want to learn.

olive willow Jun 4, 2019, 9:33 PM

#

I enjoy it, that's why I'm so enthusiastic about it and want to learn it.

#

I'm repeating it every day because, it might sound weird to some people but I love math and things that make sense like chopping a circle down and then putting it into a graph and getting the cm^2 that way for example

#

instead of using the normal pi r^2

#

sry dude bye have to go to sleep now, will read it tomorrow !

stoic beacon Jun 4, 2019, 9:42 PM

#

All of this has hurt my head

#

And just the fact that it's so vast and there's so much to know I think I'm just done lol

#

It's too much math that's way over my head and spending 2-3+ years to learn something to just the point of basic understanding is just not my idea of a hobby lol

desert oar Jun 4, 2019, 10:04 PM

#

@olive willow no for basic 1 variable linear regression you can get by with basic calculus if you want to really understand

#

you can skip that honestly

#

this is why people usually go to school for years when doing this stuff..

stoic beacon Jun 4, 2019, 10:55 PM

#

@desert oar what about me bruh

#

I need wisdoms too

#

Where can I best understand the high level principles behind ML and the algorithms involved? Not trying to become a math wizard or scholar or anything

desert oar Jun 4, 2019, 11:12 PM

#

what do you already know, and what are you trying to do with what you learn?

#

Just get a better understanding so you can follow the news and not be completely lost?

#

Or fit basic models?

stoic beacon Jun 4, 2019, 11:13 PM

#

Fit basic models sir

#

Maybe do some fun things with some Kaggle data or work related data

desert oar Jun 4, 2019, 11:13 PM

#

Ok

#

Im really not the best one to ask, i dont know too many resources

#

kaggle has their own tutorial content but its very limited

stoic beacon Jun 4, 2019, 11:14 PM

#

Yeah I saw :(

#

Oh well

#

I'll just keep gaining bits of info here and there as I watch things

sand reef Jun 5, 2019, 6:14 AM

#

Say. Is reservoir computing still a big thing? Or it never was?

karmic geyser Jun 5, 2019, 8:46 AM

#

I have some python code and I need to do a lot of maths on a lot of data inside a function in almost real time. How would I normally go about making it faster? I have already rewritten most of it with performance in mind, and I'm running multiple instances of the function on different signals in separate threads/processes. This is what line profiler says about the function. Ignore the thing about bandpass as it's not actually a bandpass yet.

#

📎 unknown.png

#

It's currently using about 3.3 ghz of total cpu usage and I need to try to get it to under 1ghz

#

would you use something like cython? I haven't used it before.

sand reef Jun 5, 2019, 9:11 AM

#

Well. About cpython.

#

https://stackoverflow.com/questions/17130975/python-vs-cpython

Stack Overflow

Python vs Cpython

What's all this fuss about Python and CPython (Jython,IronPython), I don't get it:

python.org mentions that CPython is:
The "traditional" implementation of Python (nicknamed CPython)
yet another

#

Apparently PyPy, not PyPy3, is super fast. Faster than cpython.

#

https://hackernoon.com/which-is-the-fastest-version-of-python-2ae7c61a6b2b

Hacker Noon

Which is the fastest version of Python?

Of course, “it depends”, but what does it depend on and how can you assess which is the fastest version of Python for your application?

#

@karmic geyser

zenith nova Jun 5, 2019, 9:19 AM

#

Cython was probably the thing that was meant

karmic geyser Jun 5, 2019, 9:19 AM

#

Hey sorry I didn't have it open.

#

I could maybe use pypy instead of cpython. But I don't mean using a different interpreter I mean having a single function written in C or C++ code and called from python.

sand reef Jun 5, 2019, 9:22 AM

#

Yeah. Cython is 3-4x faster than pypy

#

So, I think, that might fix your issue.

karmic geyser Jun 5, 2019, 9:24 AM

#

Yeah I'm running into performance issues on my desktop pc and eventually want to run it on a 1ghz single core arm processor. most of the code is fast enough it's just a few filters that I will be applying to a big array of data that might need to be done in c++. How hard is it to use cython for a single function?

zenith nova Jun 5, 2019, 9:25 AM

#

It's designed for that, so hopefully not very hard

karmic geyser Jun 5, 2019, 9:26 AM

#

Okay I will give it a shot.

sand reef Jun 5, 2019, 9:27 AM

#

Like a tutorial?

#

https://pythonprogramming.net/introduction-and-basics-cython-tutorial/

Python Programming Tutorials

Python Programming tutorials from beginner to advanced on a massive variety of topics. All video and text tutorials are free.

karmic geyser Jun 5, 2019, 9:28 AM

#

Thanks I will have a look at that.

polar acorn Jun 5, 2019, 9:29 AM

#

@karmic geyser If you have time I've also heard nice things about https://github.com/pybind/pybind11 for when you need just one function running in c++. I've never tried it out myself though.

GitHub

pybind/pybind11

Seamless operability between C++11 and Python. Contribute to pybind/pybind11 development by creating an account on GitHub.

karmic geyser Jun 5, 2019, 9:30 AM

#

alright I will read about that too. I think I used ctypes for something in the past because I wanted to access some windows dll functions and python didn't have an interface.

desert oar Jun 5, 2019, 10:37 AM

#

Cython's main difficulty is sparse and IMO somewhat incoherent documentation

#

If you know C it's probably easier to learn

#

You're just trying to wrap a C++ function?

karmic geyser Jun 5, 2019, 10:39 AM

#

I have an algorithm in python but it's quite slow so I wanted to make just that algorithm function run faster.

#

I have about 23ms to run the algorithm on 1024 values. I think it would be a lot faster in c or c++.

desert oar Jun 5, 2019, 10:42 AM

#

If you share the code i can probably help

#

Chances are there is something you can do to improve performance without using cython

#

But yes, when I rewrite something in cython i usually get about a 50% performance improvement without doing much of anything other than copy and paste

olive willow Jun 5, 2019, 10:43 AM

#

Yo dude

desert oar Jun 5, 2019, 10:43 AM

#

How long does it take currently?

olive willow Jun 5, 2019, 10:45 AM

#

You can use libs to try to make it faster? Like numpy array instead of list

karmic geyser Jun 5, 2019, 10:46 AM

#

I already made some changes. Instead of using a circular buffer for some values I only store the last values of the chunk for the next one. I'm also running the algorithm on 2 different cores but that won't help when I move it to the embedded device. at the moment it uses up about 70% of a 4.5ghz cpu core. and I need to run it on a 1ghz arm cpu.

desert oar Jun 5, 2019, 10:47 AM

#

Well unless you post the code nobody can help

#

Are you operating on images? Text? Etc.

#

What do you mean by a buffer?

#

You're trying to iterate over something in chunks?

karmic geyser Jun 5, 2019, 10:47 AM

#

Audio data. maybe I meant ring array. let me post the code + the profiler

desert oar Jun 5, 2019, 10:48 AM

#

Thanks, it will just be much easier to assess the situation that way

karmic geyser Jun 5, 2019, 10:55 AM

#

https://gist.github.com/Sartek/ff54906baedc01bf0fda999584788c5b

Gist

FILTER

FILTER. GitHub Gist: instantly share code, notes, and snippets.

desert oar Jun 5, 2019, 10:56 AM

#

OK, I think I have some performance improvements we can make once I get to a computer

karmic geyser Jun 5, 2019, 11:03 AM

#

The algorithm at the moment pretty much just acts kinda like a low pass filter. I will need a few different algorithms but I'm still learning how to implement them. Python seems to be kind of slow and I'm not really sure how to optimise python code. With other python stuff I have done it hasn't been a problem, but I think that was because the libraries I used had the heavy stuff implemented in c or c++. I don't mind writing the actual algorithms in c++ or c, I'm just not sure what is the best way to do that and then call it from python.

#

I think the algorithm I wrote might just be a moving average

desert oar Jun 5, 2019, 11:09 AM

#

Yeah. Also looping is slow because of lots of memory allocation and other overhead

karmic geyser Jun 5, 2019, 11:09 AM

#

yeah, I figured if I wrote it in c++ I could avoid most of that.

desert oar Jun 5, 2019, 11:10 AM

#

If youre using numpy for looping you might as well use a list

#

Ill take a look but yes. This might be a good candidate for cython

karmic geyser Jun 5, 2019, 11:12 AM

#

Some stuff I might need to do is multiply every value in the array by a constant which I think numpy helps with. The audio library I use for outputting and recording the samples gives a numpy array as well.

desert oar Jun 5, 2019, 11:12 AM

#

Thats a 1 liner in numpy and extremely efficient

#

x = np.array([1,2,3])
print(x)

y = x * 10
print(y)

karmic geyser Jun 5, 2019, 11:15 AM

#

'''
def stereotomono(left, right, gain):
    left = left * gain
    right = right * gain
    mono = left + right

    return mono

'''

#

Woops, but I was doing that to multiply an array to lower or increase the volume and then I was summing the 2 arrays.

desert oar Jun 5, 2019, 11:16 AM

#

Yep that should work

#

In your algorithm you loop over bpos twice for every pass over samples

#

Oh nvm

#

Hah yeah this is a moving average isnt it

karmic geyser Jun 5, 2019, 11:18 AM

#

pretty much I'm adding a few of the previous samples to the start of the algorithm.

#

Yeah I think so.

#

📎 unknown.png

#

on the left is the effect it had on some music I was playing. the right is with the filter turned off.

desert oar Jun 5, 2019, 11:20 AM

#

https://stackoverflow.com/a/44797397/2954547 for a non numpy version

Stack Overflow

Moving average or running mean

Is there a scipy function or numpy function or module for python that calculates the running mean of a 1D array given a specific window?

#

Numpy version https://stackoverflow.com/a/14314054/2954547

Stack Overflow

How to calculate moving average using NumPy?

There seems to be no function that simply calculates the moving average on numpy/scipy, leading to convoluted solutions.

My question is two-fold:
What's the easiest way to (correctly) implement a

#

The comments on the numpy answer are enlightening as well

karmic geyser Jun 5, 2019, 11:23 AM

#

Pretty much I want to get an audio signal and turn it into 3 audio signals. 1 that is lowpass below 150 hz. 1 that is bandpass of 150-3500hz. and 1 that is 3500hz highpass.

#

I tried using scipy butterworth filter but the tutorials were not really clear and it didn't seem to work correctly. it was just making everything quiet.

desert oar Jun 5, 2019, 11:53 AM

#

can you share your scipy code?

#

the example here looks straightforward enough to me

#

https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html

#

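from scipy.signal import butter, sosfilt

sample_rate = 44100  # Hz; an assumed value, change it to match your audio
freq_lo = 150
freq_hi = 3500
filter_order = 6
# output='sos' is required so the result can be passed to sosfilt;
# fs= lets the cutoffs be given in Hz instead of as a fraction of Nyquist
sos1 = butter(filter_order, freq_lo, 'lowpass', fs=sample_rate, output='sos')
sos2 = butter(filter_order, (freq_lo, freq_hi), 'bandpass', fs=sample_rate, output='sos')
sos3 = butter(filter_order, freq_hi, 'highpass', fs=sample_rate, output='sos')

def filter3(y):
    return sosfilt(sos1, y), sosfilt(sos2, y), sosfilt(sos3, y)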

#

then you'll have to play around with the cutoffs and order in order to get the response to look how you want

#

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import butter, freqs
from collections.abc import Iterable  # importing it from plain `collections` is deprecated

def plot_butter(N, Wn, btype):
    # analog=True, so Wn is an angular frequency in rad/s
    b, a = butter(N, Wn, btype, analog=True)
    w, h = freqs(b, a)
    plt.semilogx(w, 20 * np.log10(np.abs(h)))

    plt.title('Butterworth filter frequency response')
    plt.xlabel('Frequency (rad / sec)')
    plt.ylabel('Amplitude (dB)')

    plt.margins(0, 0.1)
    plt.grid(which='both', axis='both')

    # mark the cutoff (or both band edges) with vertical lines
    if isinstance(Wn, Iterable):
        for cutoff in Wn:
            plt.axvline(cutoff)
    else:
        plt.axvline(Wn)

plot_butter(6, 150, 'low')
plt.show()

karmic geyser Jun 5, 2019, 12:11 PM

#

    #I think this generates the values that the filter will use.
    def butter_bandpass(lowcut, highcut, fs, order=5):
        nyq = 0.5 * fs
        low = lowcut / nyq
        high = highcut / nyq
        sos = butter(order, [low, high], analog=False, btype='band', output='sos')
        return sos
    
    #I think this applies the filter to the data.
    def butter_bandpass_filter(data, lowcut, highcut, fs, order=5):
        sos = butter_bandpass(lowcut, highcut, fs, order=order)
        y = sosfilt(sos, data)
        return y

desert oar Jun 5, 2019, 12:12 PM

#

what's the nyquist frequency? i don't know much of anything about signal processing

karmic geyser Jun 5, 2019, 12:12 PM

#

fs was 44100, lowcut was 150 and highcut was 3500

desert oar Jun 5, 2019, 12:13 PM

#

fs = frequency sampling rate?

karmic geyser Jun 5, 2019, 12:13 PM

#

nyquist frequency is half of the sample rate. It's pretty much the highest frequency sine wave that a given sample rate can represent.

desert oar Jun 5, 2019, 12:13 PM

#

so that code looks right

#

do you have a sample signal we can test on?

karmic geyser Jun 5, 2019, 12:14 PM

#

I had the order at 6 but it was making the entire song quiet. if I turned the order lower it just seemed to be making it slightly less soft. if I multiplied it by like 16384 it ended up sounding close to the original but with some distortion from compressing and then expanding the samples I think.

desert oar Jun 5, 2019, 12:15 PM

#

so your data is sampled how many times / second?

#

just so i understand what's going on here

#

44100?

karmic geyser Jun 5, 2019, 12:15 PM

#

44100 times

desert oar Jun 5, 2019, 12:16 PM

#

ok, so a spike at 0 and 44099 would be a 1hz signal

#

right?

karmic geyser Jun 5, 2019, 12:18 PM

#

I pretty much have a microphone/ line input that I am playing music from my phone through. I then grab 1024 samples on my computer. apply some processing then output the 1024 samples to my computers headphones. the latency is somewhere under 0.1 seconds. all this is happening in real time and constantly streaming from an input to an output.

#

I think so.

desert oar Jun 5, 2019, 12:18 PM

#

ok

karmic geyser Jun 5, 2019, 12:18 PM

#

a 1 hz signal would pretty much be a sine wave that repeats every 1 second.

desert oar Jun 5, 2019, 12:18 PM

#

well anyway does your hand-written code work?

#

its just slow?

#

cause i feel like we can definitely get the butterworth filter working, but yes you can probably write your code in cython for a significant speedup

#

can you give an example signal i can test with

karmic geyser Jun 5, 2019, 12:20 PM

#

I can open a wave file, apply the processing then output a wavefile with some python libraries I think.

desert oar Jun 5, 2019, 12:20 PM

#

ok, thats not really what im asking

#

but if you find a signal processing library that's probably the best option

karmic geyser Jun 5, 2019, 12:24 PM

#

I can't really give the same samples as I'm streaming it from an input. I can program some stuff to read/write the samples though.

desert oar Jun 5, 2019, 12:25 PM

#

any kind of test data should work

#

how are you testing your code in the first place?

#

anyway, in your code there are a few things that can be optimized

#

the variable n is completely unnecessary, it's always just equal to bpos_max by the time you use it

#

im still not totally sure what x is but if it's a sliding window, then you can obtain that much more efficiently

#

also i += 1 should be ever so slightly more efficient than i = i + 1 which might matter on an embedded device

karmic geyser Jun 5, 2019, 12:29 PM

#

yeah I was doing some stuff like adding 2 values then dividing it by n then resetting n back to zero. x is the summed value of the previous "bpos_max" samples

#

Yeah I wanted to do i++; but python doesn't have that haha

#

N is redundant with how it is atm.

desert oar Jun 5, 2019, 12:33 PM

#

what happens if i is 3 and bpos_max is 16?

#

you just sum the 0-3rd values?

karmic geyser Jun 5, 2019, 12:37 PM

#

pretty much I'm adding the previous chunk's 16 samples to the start of the algorithm. I'm summing samples 0-15 + the current sample, then dividing it by the total samples which is 16. I'm then moving an offset by 1 and repeating it again up to the total number of samples in the chunk. I'm then setting the buffer to the last 16 samples of that chunk

desert oar Jun 5, 2019, 12:37 PM

#

so what happens when you're at the 3rd sample in the signal

#

you don't have 16 previous samples

karmic geyser Jun 5, 2019, 12:38 PM

#

the summed + divided value is then stored as a sample in an array to be returned

#

the first time I run it the 16 samples are all equal to 0.0

desert oar Jun 5, 2019, 12:38 PM

#

got it

#

use the moving average code i sent

#

in the stackoverflow examples

#

as long as you aren't extremely memory constrained it's the most efficient option

#

it pre-computes the entire sequence of cumulative sums

#

then subtracts off whatever is outside the window
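
The cumulative-sum approach described here can be sketched in a few lines of NumPy. This is an illustration of the general trick rather than a copy of the linked Stack Overflow answer, and the function name `moving_average` is made up for this example.

```python
import numpy as np

def moving_average(samples, window):
    """Mean of every full `window`-length run in `samples`."""
    # Prepend a zero so that csum[k] is the sum of the first k samples.
    csum = np.cumsum(np.insert(np.asarray(samples, dtype=np.float64), 0, 0.0))
    # The sum of samples[i:i+window] is csum[i+window] - csum[i].
    return (csum[window:] - csum[:-window]) / window

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(moving_average(signal, 3))  # [2. 3. 4.]
```

Each output costs one subtraction and one division regardless of the window size, at the price of holding one extra array of running sums in memory.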

karmic geyser Jun 5, 2019, 12:40 PM

#

yeah, I'm only cpu constrained. I got 512 megabytes of memory on the device that will run the code.

#

I think it's only like 35 kilobytes of samples.

desert oar Jun 5, 2019, 12:44 PM

#

alright let me think about this

karmic geyser Jun 5, 2019, 12:44 PM

#

It's ~1.5 megabytes a second of samples but I'm doing it at roughly 43 chunks a second so I don't need much memory. It's mainly a lot of operations.

desert oar Jun 5, 2019, 12:44 PM

#

yes you can rewrite this in cython and should see significant improvements

karmic geyser Jun 5, 2019, 12:44 PM

#

Ideally I would use someones library, put the filter values in then they would probably do it in c++.

#

Like what numpy and scipy probably does right?

desert oar Jun 5, 2019, 12:45 PM

#

yeah. or fortran 😉

karmic geyser Jun 5, 2019, 12:46 PM

#

Haha why not verilog 😛

desert oar Jun 5, 2019, 12:53 PM

#

hmm im confused as to how this buffer is working

stoic beacon Jun 5, 2019, 12:53 PM

#

Can I jump in to ask a dumb and unrelated question?

desert oar Jun 5, 2019, 12:53 PM

#

sure

karmic geyser Jun 5, 2019, 12:53 PM

#

Always

desert oar Jun 5, 2019, 12:53 PM

#

@karmic geyser it looks like you never update the buffer contents until the end of the loop

karmic geyser Jun 5, 2019, 12:54 PM

#

Yeah I don't need to update it until the end, it was a small optimisation I thought of haha.

        i = 0
        x = 0
        n = 0
        for cur_sample in in_data:
            for sample in buffer:
                x += sample
                n += 1
                
            buffer.replaceOldest(cur_sample)
            new_signal[i] = (x + cur_sample) / n
            
            i += 1
            x = 0
            n = 0
            
        return new_signal

desert oar Jun 5, 2019, 12:54 PM

#

then how is the buffer being populated

karmic geyser Jun 5, 2019, 12:54 PM

#

that was the old code before I did some optimising and maybe changed it.

desert oar Jun 5, 2019, 12:54 PM

#

arent you just pulling the first 16 values all the time?

stoic beacon Jun 5, 2019, 12:54 PM

#

Been watching some TensorFlow videos and I honestly have no idea what's happening. I've watched enough high level videos and read enough articles that I generally understand how a neural net works but TensorFlow code is confusing me to shit. That being said, would Keras be sufficient for any stupid project I want to do? I don't need to do anything super scholarly or hardcore or like...cutting edge. Would the simplicity and ease of understanding of Keras be better?

desert oar Jun 5, 2019, 12:54 PM

#

probably sufficient

#

but what part of tensorflow is confusing?

#

like... what's the most sophisticated code that you can understand?

#

also are you sure you understand how a NN works? i dont mean to be confrontational, sometimes we think we know more than we do

stoic beacon Jun 5, 2019, 12:55 PM

#

Just the general workflow in terms of creating the actual net. What are Placeholders, what are Variables, what's a Graph, how do you create the layers, etc

desert oar Jun 5, 2019, 12:55 PM

#

do you know what backpropagation is,what gradients are, etc.?

stoic beacon Jun 5, 2019, 12:56 PM

#

I understand that there are input neurons which hold your data and the connections hold the weights and the hidden layers perform some activation function on your data

desert oar Jun 5, 2019, 12:56 PM

#

ok... you'll need a more technical understanding than that in order to understand tensorflow

#

which i highly recommend developing. but for now keras is probably more friendly for your use case

stoic beacon Jun 5, 2019, 12:57 PM

#

Then some calculus is used to find the minimum point, a la gradient descent

desert oar Jun 5, 2019, 12:57 PM

#

tensorflow is really a "differentiable tensor computation graph engine"

#

for which NNs happens to be the most immediate use

stoic beacon Jun 5, 2019, 12:57 PM

#

I gatcha

#

I must've watch that 3b1b video on NNs three or four times and yeah I clearly already forgot the big parts

#

Watched

desert oar Jun 5, 2019, 12:58 PM

#

yeah if you don't feel comfortable with the equations, you will struggle to make TF work for you

stoic beacon Jun 5, 2019, 12:58 PM

#

I was able to give a better detail a few days ago

desert oar Jun 5, 2019, 12:58 PM

#

layers are kind of an abstraction

stoic beacon Jun 5, 2019, 12:58 PM

#

So TensorFlow really kinda forces you to understand the math?

#

Not to sound like an ignorant fool or someone who doesn't want to learn it all, I just simply don't have the time

#

So something that abstracts away some of that math is probably best

karmic geyser Jun 5, 2019, 1:00 PM

#

The buffer keeps its state between function calls. Think of it as me remembering the last "n" samples. If I have all the samples in memory and I know what position I am up to, then I don't need to write anything to the buffer until I am done with the current chunk. Then I just save the last few values of the chunk that the next function call will need to smoothly apply the algorithm. I could replace the oldest value in the buffer with the newest sample and shift the index by 1 every time, but it's just not needed and it's slower than the way I switched to, even though it was easier to read.

#

What do you want to do with machine learning?

#

@stoic beacon

desert oar Jun 5, 2019, 1:01 PM

#

yeah it does @stoic beacon

stoic beacon Jun 5, 2019, 1:02 PM

#

@desert oar I had a feeling haha. The series I was watching had him creating out the actual z = xw + b and I'm like...wut
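
The `z = xw + b` step mentioned here is just a matrix multiply plus a bias, followed by an activation function. A minimal NumPy sketch, with arbitrary shapes and random values chosen only for illustration, looks like this; a fully connected layer in a higher-level library wraps roughly the same computation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # a batch of 4 examples with 3 features each
w = rng.normal(size=(3, 2))   # weights connecting 3 inputs to 2 units
b = np.zeros(2)               # one bias per unit

z = x @ w + b                 # pre-activation values, shape (4, 2)
a = np.maximum(z, 0.0)        # ReLU activation: negatives become zero

print(z.shape, a.shape)
```

Training then amounts to nudging `w` and `b` so that the final output gets closer to the labels, which is where the gradients come in.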

#

@karmic geyser oh just stupid work things. I try to self improve every so often and I pick a topic I'm interested in and try to learn some of it

#

Without going too deep into any one thing. Jack of all trades, master of none kind of thing but I'm okay with that

desert oar Jun 5, 2019, 1:04 PM

#

@karmic geyser im confused as to what your code is doing then

stoic beacon Jun 5, 2019, 1:04 PM

#

I enjoy Python so I picked ML to practice some Python while learning something that interests me

desert oar Jun 5, 2019, 1:04 PM

#

@karmic geyser how many elements is filterdata operating on at once?

#

if you can explain your algorithm in words it might help

karmic geyser Jun 5, 2019, 1:05 PM

#

You can probably find tutorials for tensorflow with stuff like "tensor flow character recognition" in google. I think there was a white to black 32x32 pixel thing that you trained to recognise letters.

stoic beacon Jun 5, 2019, 1:05 PM

#

And following a tutorial is great but it wouldn't be self improvement if I just blindly follow a tutorial and can't understand it

desert oar Jun 5, 2019, 1:05 PM

#

precisely

stoic beacon Jun 5, 2019, 1:06 PM

#

Even if I just loosely understand what Keras is doing I'd be happy haha

#

Even if I have to use the words "magic" and "awesome maths stuff"

#

And I don't usually put "awesome" and "math" in the same sentence

karmic geyser Jun 5, 2019, 1:13 PM

#

pretty much make a 32x32 pixel image with a character in it. apply some kind of distortions/blurs. have like 10 different ones for each character. you use that as your training data. it then sets weights of "neurons" so that it gets as close as possible to 100% correct guesses of your training data. pretty much the "neurons" will find patterns in the data based on the intensity and position of values and how they compare to ones adjacent to them, it will then "guess" at what the character should be.

#

you could do stuff like trying to centre the character in the image before your program guesses it as that could improve the accuracy.

#

a neural network pretty much compares the inputs to each other to come up with an output. you tune the neural network's parameters/shape/size/weights/how inputs are linked etc. so that the output gives you the output that you want most of the time.

#

That's supervised learning.

#

@desert oar filterdata operates on an input of 1024 floats in an array and returns 1024 floats. It has an internal memory that stores the last 16 elements of the previous array it was passed and uses those to continue from where it last got up to. The output samples are 17 samples added together then divided by 17; it then offsets everything by 1 to generate the next sample. The first sample uses all 16 values from the buffer + 1 from the input array. The second sample uses the newest 15 samples from the buffer + 2 from the input array, and so on until it's using 0 samples from the buffer and 17 samples from the input array. From that point onward it just offsets 1 by 1 along the input array, calculating a sample from 17 samples until it gets to the end of the array. At that point it saves the last 16 values of the array to the buffer to use next time.
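
The chunked scheme described above (a 17-sample average that carries the last 16 samples over to the next call) can be sketched with NumPy instead of nested Python loops. This is not the code from the gist; the class name `ChunkedMovingAverage` and its layout are assumptions made for illustration.

```python
import numpy as np

class ChunkedMovingAverage:
    """Moving average over a stream that arrives in chunks."""

    def __init__(self, window=17):
        self.window = window
        # The previous window-1 samples; all zeros before the first chunk.
        self.tail = np.zeros(window - 1)

    def process(self, chunk):
        chunk = np.asarray(chunk, dtype=np.float64)
        # Prepend the carried-over samples so every output has a full window.
        padded = np.concatenate([self.tail, chunk])
        kernel = np.full(self.window, 1.0 / self.window)
        # mode="valid" returns exactly len(chunk) averaged samples.
        out = np.convolve(padded, kernel, mode="valid")
        # Keep the newest window-1 samples for the next call.
        self.tail = padded[len(padded) - (self.window - 1):]
        return out

smoother = ChunkedMovingAverage(window=17)
chunk = np.ones(1024)            # stand-in for one block of audio samples
smoothed = smoother.process(chunk)
print(smoothed.shape)            # (1024,)
```

Because the tail is carried between calls, feeding the stream in 1024-sample chunks gives the same result as filtering the whole signal in one pass.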

desert oar Jun 5, 2019, 1:59 PM

#

yeah but you're always just using the final 16 elements from the previous array

#

there's no window sliding over the current array

#

also why are you even chunking it up like that

#

if your whole array can fit in memory then just do it all in one pass

karmic geyser Jun 5, 2019, 2:00 PM

#


    while i < (samples):
        while ii < bpos_max:
            if (ii + i) >= bpos_max:
                x = x + in_data[i+ii-bpos_max]
            elif (ii + i) < bpos_max:
                x = x + buffer.getValue(i+ii)

#

that starts off with the 16 values from buffer then 15 then 14. slowly adding values from the regular array.

desert oar Jun 5, 2019, 2:01 PM

#

but where are you adding values

#

ohhhh

#

oh i see

#

ok

#

err why are you using this circular thing at all then

#

vs just storing the last 16 values in a regular old array?

karmic geyser Jun 5, 2019, 2:01 PM

#

the if and elif are swapped around cause I figured that 98% of the time the first if statement is true.

desert oar Jun 5, 2019, 2:01 PM

#

also you can just do else instead of that elif

karmic geyser Jun 5, 2019, 2:04 PM

#

yeah the else also works.

desert oar Jun 5, 2019, 2:04 PM

#

so why not just store the last 16 elements of the previous array?

#

why this ring buffer business?

karmic geyser Jun 5, 2019, 2:05 PM

#

Originally every sample was going into the ring buffer haha. Then I realised I didn't need to do that. I will replace it with a list

desert oar Jun 5, 2019, 2:06 PM

#

also why do this in chunks of 1024

#

instead of just.. an array

#

oh cause youre reading them 1024 at a time?

karmic geyser Jun 5, 2019, 2:07 PM

#

yes. It's going to be a digital crossover for speakers

#

1024 samples is enough that I should be able to do most algorithms and the delay isn't too much. Also less overhead than if I were to do it in chunks of 64 etc.

#

slow computers can't handle low chunk size as well. it adds stuttering to the playback

#

I'm going to have a small linux device that receives an audio signal via spdif or 3.5mm stereo jack, then outputs it to 3 different 3.5mm jacks with different processing based on what kind of speaker it is. Only send bass to the subwoofer, high frequency to tweeters etc. if you send too high an amplitude signal at a low frequency to a tweeter it will break it. and if you send too much high frequency stuff to a subwoofer the bass will not be as clear.

desert oar Jun 5, 2019, 2:16 PM

#

if you send me some sample data i can test my implementation

#

sample inputs and outputs

karmic geyser Jun 5, 2019, 2:17 PM

#

Pretty much my $350 speakers amplifier/subwoofer broke and it was proprietary. They don't make them anymore. I managed to find someone who was selling the exact same speakers cause they had the same problem and I got them for $30.

#

Yeah I'm writing a python script to read a wave file and pass it in chunks

olive willow Jun 5, 2019, 2:27 PM

#

guys so tensorflow is for ML and keras DL

karmic geyser Jun 5, 2019, 2:28 PM

#

deep learning is like an onion

#

it has lots of layers

olive willow Jun 5, 2019, 2:28 PM

#

yh

#

btw what do you need for DL

#

?

#

ML, math, programming, understanding of data. and what more?

karmic geyser Jun 5, 2019, 2:32 PM

#

Understanding of the industry/field that you are trying to use deep learning with.

olive willow Jun 5, 2019, 2:33 PM

#

I'm not in an industry yet but I want to go e-commerce or social media

karmic geyser Jun 5, 2019, 2:35 PM

#

social media it would probably help to know a bit about fake accounts and how to identify them to clean up your test data. e-commerce maybe it would help to know what data to collect on customers, if in doubt all of it that is legal haha

#

I'm not in the fields I was just trying to think of examples of things that would be industry specific knowledge that is relevant to the problems.

olive willow Jun 5, 2019, 2:36 PM

#

so do you use ML or DL for targeted ads?

karmic geyser Jun 5, 2019, 2:36 PM

#

I would say deep learning is still machine learning. It's just a category of machine learning that is more complicated.

olive willow Jun 5, 2019, 2:37 PM

#

I never have been in an industry

#

ooohhh sure

karmic geyser Jun 5, 2019, 2:37 PM

#

Generally with machine learning you want it to be as simple as possible to get the result you want/are looking for.

olive willow Jun 5, 2019, 2:38 PM

#

yh not to over do it especially with data science

#

do you do predictive analytics with ML or just an algorithm

karmic geyser Jun 5, 2019, 2:38 PM

#

more complicated makes it harder to train, harder to understand, harder to predict etc.

olive willow Jun 5, 2019, 2:38 PM

#

yh

karmic geyser Jun 5, 2019, 2:39 PM

#

predictive analytics is something you do.

olive willow Jun 5, 2019, 2:39 PM

#

ooh sure

karmic geyser Jun 5, 2019, 2:39 PM

#

machine learning is one tool you can use to get there.

olive willow Jun 5, 2019, 2:39 PM

#

yh that's what I meant

#

can you give me an example of uses of ML in data science like for example the analyzing part

karmic geyser Jun 5, 2019, 2:40 PM

#

an example of predictive analytics might be this

#

https://repositories.lib.utexas.edu/handle/2152/45875

#

what do you mean by the analyzing part?

olive willow Jun 5, 2019, 2:41 PM

#

so if you already have pre processed the data and you start analyzing it

karmic geyser Jun 5, 2019, 2:42 PM

#

Most machine learning data is pre processed before you give it to the machine learning code.

olive willow Jun 5, 2019, 2:43 PM

#

yh but what can you do with ML in data science? like group data that is coming every second for example a red flag of a bank transfer

desert oar Jun 5, 2019, 2:44 PM

#

yes

#

ML has subsumed a lot of what would have traditionally been called statistics

olive willow Jun 5, 2019, 2:45 PM

#

oooh sure

desert oar Jun 5, 2019, 2:45 PM

#

basically "automated predictive modeling" = "machine learning", "one-off modeling or causal analysis" = "statistics" 🤷

#

the distinction is more one of application nowadays

#

the methods are different though

#

eg you wouldnt typically use a random forest to infer the distribution of cancer cell sizes

olive willow Jun 5, 2019, 2:45 PM

#

what's modeling I've seen it sooo many times but don't really know the definition

#

yh I know that

desert oar Jun 5, 2019, 2:45 PM

#

nor would you typically use a hierarchical bayesian model to run online fraud detection

#

unless you could optimize it for such a purpose

#

actually thats kinda not true, you totally could

#

but you know what i mean

#

modeling is... fitting a model

#

implied, with the purpose of capturing some truth about the world

#

rather than "just" making predictions

olive willow Jun 5, 2019, 2:47 PM

#

but you would use a linear regression model for example with sensor data to see if it didn't overheat for example and give out an inaccurate measurement

desert oar Jun 5, 2019, 2:47 PM

#

sure

#

linear regression kind of sits at the intersection between "machine learning" and "statistics"

karmic geyser Jun 5, 2019, 2:47 PM

#

Okay say you have some basic maths algorithms and you are a bank and you use it to work out whether you should or shouldn't give someone a loan. you can then keep data on all the loans you have and use machine learning to try find patterns in people who defaulted on those loans. you also could give loans to people that don't quite pass the basic maths algorithm to get more data and then from that use machine learning based on whether they defaulted on the loan or not.

olive willow Jun 5, 2019, 2:47 PM

#

yh

desert oar Jun 5, 2019, 2:47 PM

#

what rayzar said, but note that said "machine learning" algorithm could well be based on a statistical model

olive willow Jun 5, 2019, 2:48 PM

#

but what's a model??

desert oar Jun 5, 2019, 2:48 PM

#

a representation of the world

#

in mathematical terms

karmic geyser Jun 5, 2019, 2:48 PM

#

but why male models?

olive willow Jun 5, 2019, 2:48 PM

#

oohh ok, kinda understand it more now

#

thanks guys1

#

!

desert cradle Jun 5, 2019, 2:51 PM

#

@karmic geyser Are you serious? I just told you like, a second ago.

#

😛

karmic geyser Jun 5, 2019, 2:51 PM

#

lmfao

#

I have been saying that for a while and you are the first person to get it hahaha

desert cradle Jun 5, 2019, 2:52 PM

#

fun fact, that bit was ad-libbed because Stiller forgot his next line

karmic geyser Jun 5, 2019, 2:53 PM

#

You can't script that kinda stuff haha

#

stupid question here, if I have a numpy array and I want to fill it from the start with data until another array is empty how would I do it?

#

array from shape (211,2) into shape (1024,2)

stoic beacon Jun 5, 2019, 3:12 PM

#

Stochastic gradient descent is a cost function right?

karmic geyser Jun 5, 2019, 3:13 PM

#

"Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties" I think so.

#

I think it might be a part of minimizing your cost function?

#

I'm not too sure sorry.

desert oar Jun 5, 2019, 3:16 PM

#

SGD is a convex optimization algorithm

#

just like Newton's method, conjugate gradient, L-BFGS, etc
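
To make the cost function vs. optimizer split concrete, here is a minimal pure-Python sketch. The toy data, learning rate and function names are all made up for illustration: `cost` is the thing being minimised, `sgd` is the procedure that minimises it.

```python
import random

def cost(w, b, data):
    """Mean squared error: this is the cost (objective) function."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def sgd(data, lr=0.05, epochs=200, seed=0):
    """Stochastic gradient descent: the procedure that minimises the cost."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:           # one sample at a time = "stochastic"
            err = w * x + b - y
            w -= lr * 2 * err * x   # gradient of err**2 with respect to w
            b -= lr * 2 * err       # gradient of err**2 with respect to b
    return w, b

# Noise-free toy data on the line y = 3x + 1 (made up for illustration)
data = [(x / 10, 3 * x / 10 + 1) for x in range(-10, 11)]
w, b = sgd(data)
```

Swapping `sgd` for another optimizer would leave `cost` untouched, which is the sense in which SGD is not itself a cost function.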

karmic geyser Jun 5, 2019, 3:20 PM

#

@desert oar You know how I can put a small numpy array into a bigger array. pretty much its shape is (1024,2) for the big one and (<1024,2) for the second one.

#

I want to put it in right at the start.

#

Okay I got it sweet.

stoic beacon Jun 5, 2019, 3:24 PM

#

Thanks guys

#

Sorry for the dumb questions

desert oar Jun 5, 2019, 3:24 PM

#

@karmic geyser what order? you want to insert them rowwise?

karmic geyser Jun 5, 2019, 3:25 PM

#

outdata[:view.shape[0]][:] = view

#

That seemed to work.

desert oar Jun 5, 2019, 3:25 PM

#

try this instead

outdata[:view.shape[0], :] = view
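
A small self-contained version of that fill, assuming a fixed 1024-frame stereo buffer and a short final read of 211 frames (the shapes come from the question, the values are placeholders):

```python
import numpy as np

CHUNK = 1024
outdata = np.zeros((CHUNK, 2), dtype=np.float32)   # fixed-size stereo buffer
view = np.ones((211, 2), dtype=np.float32)          # short final read from the file

n = view.shape[0]
outdata[:n, :] = view    # copy the available frames to the start
outdata[n:, :] = 0.0     # explicitly silence the rest in case the buffer is reused
```

Both spellings write into `outdata`, since a basic slice is a view of the original array; the single-index form is just easier to read.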

karmic geyser Jun 5, 2019, 3:25 PM

#

oh woops thanks I made a typo and didn't notice

desert oar Jun 5, 2019, 3:35 PM

#

@karmic geyser https://github.com/gwerbin/python-discord_signal-filter

karmic geyser Jun 5, 2019, 3:36 PM

#

I have almost done the wave file thing for you.

desert oar Jun 5, 2019, 3:36 PM

#

its fine, i wrote it untested

#

i needed to learn how to use arrays in cython properly anyway

#

let me know if it works for you

#

or if it breaks 😉

karmic geyser Jun 5, 2019, 3:39 PM

#

haha, i will give it a try.

desert oar Jun 5, 2019, 3:40 PM

#

you should be able to use it with from speaker_filter import filter_signal

karmic geyser Jun 5, 2019, 4:09 PM

#

@desert oar I sent you the files. It's 2:30 am so I will get some sleep then look into using the filter you did in cython

prime elm Jun 5, 2019, 4:22 PM

#

```python
# Contouring and Plant Detection

cv2.imwrite('saved_mask.jpg', mask)
reuploadedImage = 'saved_mask.jpg'
preBlobD = cv2.imread(reuploadedImage, 1)
# cv2.imshow("Mask", preBlobD)

font = cv2.FONT_HERSHEY_COMPLEX

start = 1
for i in range(48):

    lower_value = np.array([0, 0, 0])
    upper_value = np.array([180, 255, 255])


    blobDetection = cv2.inRange(preBlobD, lower_value, upper_value)

    # Contours detection
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    for cnt in contours:
        partLabel = "{}".format(start)

        area = cv2.contourArea(cnt)
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        x = approx.ravel()[0]
        y = approx.ravel()[1]

        if area > 80:
            cv2.drawContours(preBlobD, [approx], 0, (0, 0, 0), 1)

            if start >= 49:
                break
                #Try to find the Center Here

            if len(approx) > 2:
                cv2.putText(preBlobD, partLabel, (x, y), font, 1, (255, 255, 255))

                start = start + 1

cv2.imshow("Mask", preBlobD)
```

#

ops

#

So i have this code that i wrote

#

and right now it creates contours of the masks (of another image)

#

ideally i want to get a list of all the x plots and y plots of each contour vertex

#

i need this because i want to take the averages of it so that I can find the center point of each plant (the subject of the masks)

#

📎 Screen_Shot_2019-06-05_at_8.15.36_AM.png

#

does anyone know how and if thats possible

olive willow Jun 5, 2019, 4:27 PM

#

sorry bro i can't help you, ask maybe nix

prime elm Jun 5, 2019, 4:28 PM

#

@earnest prawn

earnest prawn Jun 5, 2019, 4:28 PM

#

why am i always listed as a reference for this i have no idea about data science

prime elm Jun 5, 2019, 4:28 PM

#

actually my mentor for my internship just called and i have to go to a meeting 😞

#

cya ill be back with the same q :3

olive willow Jun 5, 2019, 4:33 PM

#

@earnest prawn because you're a smart boy

earnest prawn Jun 5, 2019, 4:33 PM

#

i am smart to the extent that i can use google (at least for this topic)

olive willow Jun 5, 2019, 4:43 PM

#

then that makes two of us!

#

hahahah

prime elm Jun 5, 2019, 5:05 PM

#

Does anyone have any suggestions

#

@olive willow @earnest prawn sry bout ping i try not to

olive willow Jun 5, 2019, 5:06 PM

#

IDK i'm 14 dude still learning even the math hahahhaha

prime elm Jun 5, 2019, 5:07 PM

#

ooo lolol

#

youngin

#

good for you

#

😄

olive willow Jun 5, 2019, 5:07 PM

#

hhahahahahha

prime elm Jun 5, 2019, 5:09 PM

#

ill be back i need to 3d print something. id appreciate if someone could help me with this. its a gate to moving on in my code

#

❤

olive willow Jun 5, 2019, 5:09 PM

#

sure

earnest prawn Jun 5, 2019, 5:45 PM

#

@prime elm dont get me wrong, you can ping me as much as you like as long as its not spam, however regarding this topic I will very unlikely be of great use to you

prime elm Jun 5, 2019, 6:07 PM

#

@earnest prawn I get that. I was just told to. if you know someone who could help with this id appreciate it. I know how to find a centroid using pixel averages, but it doesnt solve what im aiming to do next, so i need help using the vertices of the polygon to find the center point

#

So I found the data is stored in the variable like this

#


```
[[[897 568]]

 [[892 563]]

....

 [[896 577]]

 [[897 576]]]
```

#

with the left column denoting x coordinate

#

and right denoting y

#

how would i extract them and put them into two separate lists?
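
One way to do that split, sketched with a small stand-in array in the same `(N, 1, 2)` layout as the printout. The real `approx` comes from `cv2.approxPolyDP`; the four vertices here are just copied from the paste.

```python
import numpy as np

# Stand-in for `approx`: shape (N, 1, 2), one [x, y] pair per vertex
approx = np.array([[[897, 568]], [[892, 563]], [[896, 577]], [[897, 576]]])

points = approx.reshape(-1, 2)      # drop the length-1 middle axis -> shape (N, 2)
xs = points[:, 0].tolist()          # all x coordinates
ys = points[:, 1].tolist()          # all y coordinates

# Mean of the vertices, as described in the question
centre_x = sum(xs) / len(xs)
centre_y = sum(ys) / len(ys)
```

Note that the mean of the vertices is not quite the same thing as the centroid of the filled shape. If the area-weighted centre is what's needed, `cv2.moments(cnt)` gives `m10 / m00` and `m01 / m00` for that.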

prisma verge Jun 5, 2019, 6:42 PM

#

so
well
i'll just say that keras is very amazing framework

#

it's like python but for deep learning
it makes things very simple but yet flexible

#

i was able to create my own network thanks to keras even when being dumb and not knowing math at all

#

so yeah

#

that's an amazing lib

#

though i find data preprocessing quite hard

#

anyone got good libs to simplify that except opencv for images?

prime elm Jun 5, 2019, 6:47 PM

#

@prisma verge is that data type i posted above an array?

#

is it a 2 column array or one column

prisma verge Jun 5, 2019, 6:48 PM

#

have no idea, sorry

olive willow Jun 5, 2019, 7:07 PM

#

@prime elm explain

#

it's a 3d array to be exact

#

because it has a list inside a list inside a list

#

so 3 lists

#

3d

#

you need one list less

#

@prime elm can you post the entire code or at least how you imported the data

prime elm Jun 5, 2019, 7:32 PM

#

@olive willow sure

#

```python
# # # Start Up
#_______________________________________________________________________________________________________________________

### Library
import matplotlib.pyplot as plt
import numpy as np
import operator
import cv2

# Select Image
imageName = '04.24-13.26.jpg'
img = cv2.imread(imageName, 1)

# # # 1st Section of Code - Image Processing
#_______________________________________________________________________________________________________________________

# Grid Overlay
cv2.rectangle(img, (130, 25), (925, 340), (255, 255, 255), 1)

#Do the Processing
    # Color Filtering
        # HSV - Hue, Sat, Value
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_green = np.array([13, 0, 0])
upper_green = np.array([90, 225, 254])

mask = cv2.inRange(hsv, lower_green, upper_green)
res = cv2.bitwise_and(img, img, mask= mask)

    # Morphology
        #laplacian
laplacian = cv2.Laplacian(mask, cv2.CV_64F)

# Show the Image
# cv2.imshow('laplacian', laplacian)
# cv2.imshow('mask', mask)
# cv2.imshow('res', res)
cv2.imshow('Array of Plant Pots', img)
```

#


```python
# # # 2nd Section of Code - Contouring and Blob Detection
#_______________________________________________________________________________________________________________________

# Contouring and Plant Detection

cv2.imwrite('saved_mask.jpg', mask)
reuploadedImage = 'saved_mask.jpg'
preBlobD = cv2.imread(reuploadedImage, 1)
# cv2.imshow("Mask", preBlobD)

font = cv2.FONT_HERSHEY_COMPLEX

start = 1
for i in range(48):

    lower_value = np.array([0, 0, 0])
    upper_value = np.array([180, 255, 255])


    blobDetection = cv2.inRange(preBlobD, lower_value, upper_value)

    # Contours detection
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    for cnt in contours:
        partLabel = "{}".format(start)

        area = cv2.contourArea(cnt)
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        x = approx.ravel()[0]
        y = approx.ravel()[1]

        if area > 80:
            cv2.drawContours(preBlobD, [approx], 0, (0, 0, 0), 1)

            if start >= 49:
                break
                #Try to find the Center Here

            if len(approx) > 2:
                cv2.putText(preBlobD, partLabel, (x, y), font, 1, (255, 255, 255))

                start = start + 1

cv2.imshow("Mask", preBlobD)
```

#


```python
# # # 3rd Section of Code - Pixel Count and Surface Area of Each Plant
#_______________________________________________________________________________________________________________________

# Pixel Counter 0.2
# Recursion for Reading Masks
iteration = 0
for i in range(8):
    # Moving Frame Over Horizontally
    small_x = 150 + (92*i) # x_value refers to moving the frame left to right as columns
    big_x = 250 + (92*i)

    for j in range(6):
        # Moving Frame Over Vertically
        small_y = 35 + (101*j) # upper y_value has smaller numerical value
        big_y = 151 + (101*j)# lower y_value has bigger numerical value

        # Attempting to Output Grid as Multiple Windows
        fileString = 'saved_mask{}.jpg'.format(iteration)
        cv2.imwrite(fileString, mask[small_y:big_y, small_x:big_x])
        gridOutput = cv2.imread(fileString, 1)
        gridPart = 'Grid Part #{}'.format(iteration+1)
        cv2.imshow(gridPart, gridOutput)

        iteration = iteration + 1

        x = []
        y = []
        px = 0
        gridImg = gridOutput.astype('float')
        gridImg = gridImg[:,:,0] # convert to 2D array
        row, col = gridImg.shape
        for i in range(row):
            for j in range(col):
                if gridImg[i,j] == 255:
                    x.append(j) # get x indices
                    y.append(i) # get y indices
                    px = px + 1

        # Measurements
        # px : mm^2 :: 1 : (110/93)^2
        # ** is exponentiation in Python; ^ would be bitwise XOR
        print("\nGrid Part #%i\n----------------\nX values of identified pixels: %s\nY values of identified pixels: %s\n\nNumber of Pixels(s):%i\nSurface Area (1 px : ((110/93)^2) mm^2): %i" % (iteration, str(x), str(y), px, px * (110/93)**2))

# # Close and Exit
# cv2.waitKey(0)
# cv2.destroyAllWindows()

### FIN
```

olive willow Jun 5, 2019, 7:35 PM

#

what's the problem

prime elm Jun 5, 2019, 7:35 PM

#

So

#

basically

#

I need the x and y coordinates of each contour

olive willow Jun 5, 2019, 7:36 PM

#

whats a contour

prime elm Jun 5, 2019, 7:36 PM

#

the contours are built in the for loop

#

contouring is the blob-isolating part of OpenCV

#

and it basically wraps around shapes

olive willow Jun 5, 2019, 7:37 PM

#

sure so when a color changes you want the x and y of that

prime elm Jun 5, 2019, 7:37 PM

#

not color

#

in order to draw a shape

olive willow Jun 5, 2019, 7:38 PM

#

Iknow

prime elm Jun 5, 2019, 7:38 PM

#

u need to draw lines to each vertex. and i want those vertices

#

bc using that i can find the center of the shape

olive willow Jun 5, 2019, 7:39 PM

#

ooohh then I can't help you, I'm too noob for that

prime elm Jun 5, 2019, 7:39 PM

#

lol thanks for tryin

olive willow Jun 5, 2019, 7:39 PM

#

sure np sorry maybe nix

prime elm Jun 5, 2019, 7:39 PM

#

oof. idk if any of the admins know, I hope i figure out this problem

olive willow Jun 5, 2019, 7:39 PM

#

yh

prime elm Jun 5, 2019, 7:40 PM

#

the data above is in an array and im not good with thinking like a programmer fully yet. so idk how to extract only the x values from that array

#

and only the y values

olive willow Jun 5, 2019, 7:40 PM

#

with that I can help

#

I'm 14 so idk how to think either

#

so

#

so

#

so

prime elm Jun 5, 2019, 7:41 PM

#

oo

olive willow Jun 5, 2019, 7:42 PM

#

so you save the coordinates as an 2d numpy array?

prime elm Jun 5, 2019, 7:42 PM

#

errr let me paste

#

nope thats for pixel counting

#

all sections of code work

#

that section is inaccurate

#

bc what it does is

olive willow Jun 5, 2019, 7:44 PM

#

only not the dimension in the array?

#

because the array you sent before was 3d

#

you are somewhere adding another layer I think

prime elm Jun 5, 2019, 7:45 PM

#

the 3d array i think u said is the one im trying to extract info from

#

im not adding in the code

#

um look for the variable called "approx"

olive willow Jun 5, 2019, 7:45 PM

#

yh

#

I know

prime elm Jun 5, 2019, 7:45 PM

#

its in section 2

#

yuh

#

i pasted that variable to see what the data looks like

olive willow Jun 5, 2019, 7:46 PM

#

approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)

prime elm Jun 5, 2019, 7:46 PM

#

yuh

#

so ik the data is there

#

i just need to extract the x and y

olive willow Jun 5, 2019, 7:46 PM

#

can you print it

prime elm Jun 5, 2019, 7:46 PM

#

ye

#

[[[897 568]]

[[892 563]]

[[891 563]]

[[890 562]]

[[878 562]]

...

[[895 579]]

[[896 578]]

[[896 577]]

[[897 576]]]

olive willow Jun 5, 2019, 7:47 PM

#

np.array(approx) maybe

#

but why is it an 3d array?

prime elm Jun 5, 2019, 7:48 PM

#

so make a variable

```python
coord_array = np.array(approx)
```

#

i dont know

olive willow Jun 5, 2019, 7:48 PM

#

yes

prime elm Jun 5, 2019, 7:48 PM

#

its part of the library

olive willow Jun 5, 2019, 7:48 PM

#

and then print it out

prime elm Jun 5, 2019, 7:48 PM

#

aight

olive willow Jun 5, 2019, 7:48 PM

#

lets see

prime elm Jun 5, 2019, 7:51 PM

#

still the same

#

[[[897 568]]

[[892 563]]

[[891 563]]

[[890 562]]

#

...

#

etc

#

haha 😅

olive willow Jun 5, 2019, 7:52 PM

#

do print(coord_array[0])

prime elm Jun 5, 2019, 7:52 PM

#

kk

#

[[897 568]]

olive willow Jun 5, 2019, 7:54 PM

#

and change it to, [0, 0]

prime elm Jun 5, 2019, 7:54 PM

#

how would i do that

#

im bad at arrays

olive willow Jun 5, 2019, 7:54 PM

#

just coord_array[0, 0]

prime elm Jun 5, 2019, 7:54 PM

#

ooo thats what u mean

#

haha

olive willow Jun 5, 2019, 7:54 PM

#

hhaha

prime elm Jun 5, 2019, 7:54 PM

#

[897 568]

#

got rid of a []

olive willow Jun 5, 2019, 7:55 PM

#

now [0, 0, 0]

prime elm Jun 5, 2019, 7:55 PM

#

u think it a 2d array thats just in a 3d?

olive willow Jun 5, 2019, 7:55 PM

#

cuz it's 3d

#

so add another layer

prime elm Jun 5, 2019, 7:55 PM

#

897

olive willow Jun 5, 2019, 7:55 PM

#

peel it like an onion boy

#

yeeessss

prime elm Jun 5, 2019, 7:56 PM

#

so its just a 2d array as a 3d

olive willow Jun 5, 2019, 7:56 PM

#

no look

prime elm Jun 5, 2019, 7:56 PM

#

and all i got to do is make a for loop to pile it into a list

olive willow Jun 5, 2019, 7:56 PM

#

there are 3 dimensions in your array

#

[[[ ]]]

prime elm Jun 5, 2019, 7:56 PM

#

haha true. but only 2 sets of data

olive willow Jun 5, 2019, 7:56 PM

#

those are 1d

prime elm Jun 5, 2019, 7:56 PM

#

so one dimension is empty

#

oh

olive willow Jun 5, 2019, 7:57 PM

#

no

#

[ 3d [ 2d [ 1d ] 2d ] 3d ]

#

your info is in the 1d

#

it's a vector

prime elm Jun 5, 2019, 7:58 PM

#

a . a

#

ops didnt meant to do that

olive willow Jun 5, 2019, 7:58 PM

#

with an x and y cord

#

hahah

prime elm Jun 5, 2019, 7:58 PM

#

......
c . c
b . b
a . a

#

so its two pieces of info going back ^

#

like that, visually

olive willow Jun 5, 2019, 7:59 PM

#

kinda

prime elm Jun 5, 2019, 7:59 PM

#

but basically it is missing one "visual" dimension is what im saying

olive willow Jun 5, 2019, 8:00 PM

#

kinda

prime elm Jun 5, 2019, 8:00 PM

#

like i can hold one of the dimensions constant

olive willow Jun 5, 2019, 8:00 PM

#

yes

prime elm Jun 5, 2019, 8:00 PM

#

and us a for loop

olive willow Jun 5, 2019, 8:00 PM

#

because nothings there

prime elm Jun 5, 2019, 8:00 PM

#

yuh

#

and use a for loop to pull out the data

olive willow Jun 5, 2019, 8:00 PM

#

yh
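
The "peeling" steps and the hold-one-dimension-constant idea from above, sketched on three rows copied from the printout. The point is that the middle axis has length 1, so fixing it at index 0 leaves a plain table of x, y rows.

```python
import numpy as np

# Sample rows from the printout, same nesting as the real array
approx = np.array([[[897, 568]], [[892, 563]], [[891, 563]]])

print(approx.shape)      # (3, 1, 2): 3 vertices, 1 point per vertex, 2 coordinates
print(approx[0])         # [[897 568]]
print(approx[0, 0])      # [897 568]
print(approx[0, 0, 0])   # 897

# Holding the middle index at 0 gives a 2-D table of x, y rows
flat = approx[:, 0, :]   # shape (3, 2)
for x, y in flat:
    print(x, y)
```

So no dimension is really empty; the middle one just only ever has a single entry, which is why it can be indexed away.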

prime elm Jun 5, 2019, 8:01 PM

#

how does the len() function work on arrays?

olive willow Jun 5, 2019, 8:01 PM

#

idk try it

prime elm Jun 5, 2019, 8:03 PM

#

rip says 85

#

time to count

#

actually word counter.com

#

170 words

#

/2

#

85 lines

#

so it counts the groups, not the number of elements
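
A quick sketch of what the different counts mean on an array laid out like the 85-vertex contour. The array here is a zero-filled placeholder, only the shape matters.

```python
import numpy as np

coords = np.zeros((85, 1, 2), dtype=int)   # same layout as the 85-vertex contour

n_vertices = len(coords)        # 85: length of the first axis only
n_inner = len(coords[0, 0])     # 2: the x and y of one vertex
n_numbers = coords.size         # 170: every number in the array
shape = coords.shape            # (85, 1, 2): all axis lengths at once
```

`len(str(...))` measures the length of the printed text, so it is not a count of elements; `.shape` and `.size` are usually the things to look at.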

olive willow Jun 5, 2019, 8:05 PM

#

yh

#

but you need to do [0, 0]

#

remember

prime elm Jun 5, 2019, 8:06 PM

#

hmmm

#

it says 2 lol

#

i dont think im getting what u mean 😄

olive willow Jun 5, 2019, 8:07 PM

#

yh 2 numbers

prime elm Jun 5, 2019, 8:07 PM

#

i get that

#

but like i guess i still dont know the angle to attack it

olive willow Jun 5, 2019, 8:07 PM

#

[ 464 346 ]

#

those are 2 items

#

0 and 1

#

it's going by index

#

do str()

prime elm Jun 5, 2019, 8:08 PM

#

hm?

olive willow Jun 5, 2019, 8:09 PM

#

if you want the count

prime elm Jun 5, 2019, 8:09 PM

#

len(str(coord_array))

olive willow Jun 5, 2019, 8:09 PM

#

yh

prime elm Jun 5, 2019, 8:09 PM

#

and []?

#

any?

#

olive willow Jun 5, 2019, 8:09 PM

#

is it printing it out

prime elm Jun 5, 2019, 8:10 PM

#

it says 9

olive willow Jun 5, 2019, 8:10 PM

#

idk dude sorry have to go

prime elm Jun 5, 2019, 8:10 PM

#

npnp

olive willow Jun 5, 2019, 8:10 PM

#

bye ask nix

prime elm Jun 5, 2019, 8:10 PM

#

thanks for ur help

rancid gust Jun 6, 2019, 10:06 AM

#

Hey, is there an easy way to remove the matplotlib axes while keeping the grid?

olive willow Jun 6, 2019, 10:21 AM

#

I'm not sure but there has to be

sand reef Jun 6, 2019, 10:22 AM

#

@rancid gust

#

https://stackoverflow.com/questions/20416609/remove-the-x-axis-ticks-while-keeping-the-grids-matplotlib

Stack Overflow

Remove the x-axis ticks while keeping the grids (matplotlib)

I want to remove the ticks on the x-axis but keep the vertical girds. When I do the following I lose both x-axis ticks as well as the grid.

import matplotlib.pyplot as plt
fig = plt.figure()
figr...
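
One possible way to do it, not necessarily the approach from the linked answer: keep the ticks where they are but give them zero length and no labels, since grid lines are drawn at the tick locations. The plotted data and the `Agg` backend choice are just for the example.

```python
import matplotlib
matplotlib.use("Agg")            # assumption: render off-screen so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax.grid(True)

# Hide tick marks and tick labels; the grid lines follow the tick
# locations, so they stay as long as the ticks themselves are not removed
ax.tick_params(axis="both", which="both", length=0,
               labelbottom=False, labelleft=False)

# Optionally hide the box around the plot as well
for spine in ax.spines.values():
    spine.set_visible(False)

fig.canvas.draw()
```

Calling `ax.set_xticks([])` instead would remove the grid too, which is the problem described in the linked question.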

olive willow Jun 6, 2019, 10:22 AM

#

oohh yh that makes sense

rancid gust Jun 6, 2019, 10:23 AM

#

@sand reef thanks!

sand reef Jun 6, 2019, 10:24 AM

#

Np!

drowsy fulcrum Jun 6, 2019, 2:15 PM

#

📎 Figure_1.png

#

im trying to curve fit every point, anyone know how to do it?
essentially, i just want connect all the points up, but make the lines smoother
i could do this easily by hand, so it cant be very complicated
if it helps, the points will always increase in size

earnest prawn Jun 6, 2019, 2:24 PM

#

you just put all the points in an x/y array and plt.plot(x,y) them

#

if you want a function which fits all the points I am sure you can find one with a lot of work

drowsy fulcrum Jun 6, 2019, 2:24 PM

#

📎 iu.png

#

this is what i need

earnest prawn Jun 6, 2019, 2:28 PM

#

this doesnt look like the graph fits the points

#

it looks like the points fit the graph

#

especially with the last one

unkempt helm Jun 6, 2019, 2:36 PM

#

It's just polynomial interpolation

drowsy fulcrum Jun 6, 2019, 2:37 PM

#

ok interesting. that led me to this article:
https://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

#

seems to be exactly what i need ty
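
A minimal sketch of plain polynomial interpolation with NumPy, using made-up increasing points in place of the ones from the figure: a polynomial of degree n-1 goes exactly through n points, and evaluating it on a dense grid gives the smooth line.

```python
import numpy as np

# Made-up increasing data points standing in for the plotted ones
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 4.0, 7.0, 11.0])

# A polynomial of degree n-1 passes exactly through n points
coeffs = np.polyfit(x, y, deg=len(x) - 1)

# Evaluate on a dense grid to get a smooth curve for plotting
x_dense = np.linspace(x.min(), x.max(), 200)
y_dense = np.polyval(coeffs, x_dense)
```

With more than a handful of points a single high-degree polynomial tends to wiggle between them, so the splines in the linked `scipy.interpolate` tutorial (for example `CubicSpline`, or `PchipInterpolator` for data that only increases) are probably the safer choice.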

knotty nexus Jun 6, 2019, 6:05 PM

#

any tips for reading multiple .xlsx files in a folder with column headers that are slightly off?

#

Assuming that I'm not allowed to modify the source files,
I tried something like df = pd.read_excel(source location + filename,
usecols = lambda x: x.lower() in usecols_lower_list) but it doesn't seem to work

#

then concat it together

sand reef Jun 6, 2019, 6:33 PM

#

so, are you only limited to using pandas? or are willing to try other libraries?

#

because, you could try this, I hope it helps

#

https://www.oreilly.com/library/view/data-wrangling-with/9781491948804/ch04.html

O’Reilly | Safari

Data Wrangling with Python

#

@knotty nexus

knotty nexus Jun 6, 2019, 7:12 PM

#

thanks @sand reef , I just skimmed the chapter, it doesn't really seem to contain specific information on handling files with different column headers, but it did give me the idea to just skip the header rows when reading in the files, then create header names afterwards. I think in my case the columns in the different files remain in the same order, so this would also work
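
Another option, if the headers only differ in case and stray spaces, is to normalise them after reading and then select and concat. A hedged sketch with in-memory frames standing in for the workbooks; the column names in `WANTED` and both helper functions are hypothetical.

```python
import pandas as pd

WANTED = ["id", "unit price", "quantity"]   # hypothetical target column names

def normalise_columns(df):
    """Return a copy whose headers are stripped, lower-cased and single-spaced."""
    out = df.copy()
    out.columns = [" ".join(str(c).split()).lower() for c in out.columns]
    return out

def combine(frames, wanted=WANTED):
    """Normalise each frame, keep the wanted columns, and stack the rows."""
    cleaned = [normalise_columns(f)[wanted] for f in frames]
    return pd.concat(cleaned, ignore_index=True)

# In-memory stand-ins for two workbooks whose headers differ slightly;
# with real files each frame would come from pd.read_excel(path)
a = pd.DataFrame({"ID": [1], "Unit Price ": [2.5], "Quantity": [4], "Notes": ["x"]})
b = pd.DataFrame({"id": [2], " unit  price": [3.0], "QUANTITY": [1]})
combined = combine([a, b])
```

The skip-the-header idea from the message above would look like `pd.read_excel(path, header=None, skiprows=1, names=[...])`, which only works while every file keeps its columns in the same order.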

sand reef Jun 6, 2019, 7:12 PM

#

np!

spice cargo Jun 6, 2019, 7:22 PM

#

I wonder if anyone could suggest something to get started with reinforcement learning. I have covered most of the mathematical part and some basics like Gym to implement Q-learning.
I am looking for something solid to start with.
beginner here

sand reef Jun 6, 2019, 7:32 PM

#

Well, I have to begin reinforcement learning too, and I could use some pointers on where to start with its mathematics. Got any leads for me? All I have done is the regular machine learning and deep learning courses from Coursera.

spice cargo Jun 6, 2019, 7:35 PM

#

https://www.google.com/url?sa=t&source=web&rct=j&url=https://m.youtube.com/watch%3Fv%3DlvoHnicueoE&ved=2ahUKEwjJm-_8t6_iAhXRtlkKHYCeDLMQwqsBMAF6BAgKEAU&usg=AOvVaw31Hs8Zp0Dqw3iqQc8AAh-C

#

This could be the best tutorial to start

#

It basically covers all the prerequisites required for RL
like policy Grad,Actor Critic all basic information you need to start

#

Apart from that, as mentioned above, you can implement basic Q-learning (you'll see it in the tutorial) using Gym, a toolkit for developing and comparing reinforcement learning algorithms.
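
For a feel of what basic tabular Q-learning looks like, here is a tiny pure-Python sketch. It does not use Gym: the five-cell corridor and its `step` function are a made-up toy environment, and the hyperparameters are arbitrary.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right

def step(state, action_idx):
    """Hypothetical toy environment: reward 1 only on reaching the last cell."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a purely random behaviour policy is enough here
            a = rng.randrange(2)
            nxt, reward, done = step(state, a)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q = q_learning()
greedy = [row.index(max(row)) for row in q[:-1]]   # best action per non-terminal cell
```

With a Gym environment the `step` call would come from the environment object instead, and an epsilon-greedy policy is the more usual choice than fully random actions.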

#

I need someone to guide me from here...lol

sand reef Jun 6, 2019, 7:43 PM

#

Tysm!

west sky Jun 6, 2019, 8:37 PM

#

Does anybody know how to apply PCA to data with a large amount of features (170000+)? I am currently using sklearn but my computer crashes when I try to obtain a cumulative explained variance curve to determine an optimal number of components.
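
One thing that might help, assuming there are far fewer samples than features: the explained-variance curve can be computed from the singular values of the centred data, which never builds the 170000 x 170000 covariance matrix. A NumPy sketch with a small random stand-in matrix; the helper name is made up.

```python
import numpy as np

def cumulative_explained_variance(X):
    """Cumulative explained-variance ratio of PCA, computed via a thin SVD.

    Assumption: far fewer samples than features, so there are at most
    n_samples components and the feature-by-feature covariance matrix
    is never built.
    """
    Xc = X - X.mean(axis=0)                            # centre each feature
    s = np.linalg.svd(Xc, full_matrices=False, compute_uv=False)
    var = s ** 2                                       # proportional to each component's variance
    return np.cumsum(var) / var.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2000))      # small stand-in for the 170000-feature matrix
curve = cumulative_explained_variance(X)
k = int(np.searchsorted(curve, 0.95) + 1)   # components needed for 95% of the variance
```

Within sklearn, capping the work with `PCA(n_components=k, svd_solver="randomized")` or fitting in batches with `IncrementalPCA` are the usual alternatives when the full decomposition does not fit in memory.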

lean ledge Jun 6, 2019, 8:37 PM

#

No need to use a shallower CV based course to learn RL

#

Try Sutton's book, Berkeley's course and Spinning up RL

spice cargo Jun 6, 2019, 8:40 PM

#

What exactly does Spinning up RL mean?

lean ledge Jun 6, 2019, 9:26 PM

#

https://spinningup.openai.com/en/latest/

median siren Jun 6, 2019, 11:33 PM

#

Hi all, is there anyone who has experience with XGBoost? I'm trying to train a model which has X-value as a feature, but for pretty much every method the error goes:

 self._features_count = X.shape[1]

IndexError: tuple index out of range

Now, this makes sense since it's a single feature. So when I try to reshape my X_train value (e.g. using .reshape(1,-1)), the following error occurs:

ValueError: setting an array element with a sequence.

Does anyone know what's wrong?
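
A guess at what is going on, since the data isn't shown: a single feature needs shape `(n_samples, 1)`, which is `.reshape(-1, 1)` rather than `.reshape(1, -1)`. The second error often shows up when the elements are themselves lists or arrays, so the sketch below also converts nested values to a proper 2-D float array. All values are made up.

```python
import numpy as np

x = np.array([0.5, 1.2, 3.3, 4.1])     # one feature, four samples: shape (4,)

as_column = x.reshape(-1, 1)           # shape (4, 1): four samples, one feature
as_row = x.reshape(1, -1)              # shape (1, 4): ONE sample with four features

# If each element is itself a sequence, build a proper 2-D float array.
# Assumption: all inner sequences have the same length.
nested = [[0.5], [1.2], [3.3], [4.1]]
X_train = np.asarray(nested, dtype=float)    # shape (4, 1)
```

Checking `X_train.shape` and `X_train.dtype` before fitting should show quickly whether the array is really `(n_samples, 1)` with a numeric dtype rather than `object`.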