#data-science-and-ml
Yep I'm there
so vector V = -1I + 2J
It's -1 * [1, -2]
yes
Alrighty I got it
Got it. That's what I thought. Just need to watch the next video now
aV + bA
a and b are scalars
V and A are vectors; to be more precise, I hat and J hat, for example
you know what the coordinates are of I and J hat
nwm
do you know how to solve a linear equation ?
Not sure
Oh yeah duh
Lecture 1: The Geometry of Linear Equations. View the complete course at: http://ocw.mit.edu/18-06S05 License: Creative Commons BY-NC-SA More information at ...
here you see how to use matrices and how to solve equations up to R^n
yh sure but in my opinion this is the best one. I'm 14 and the guys can explain it so that even I know exactly what's up
What's up?
yup
Is this course on edx or something?
Come to India. Unless you're already in India, then you'll see what's bad explanation. They don't even want to teach and they are assigned as teachers.
Okay then. Habe phun.
It's not on edx or Coursera?
idk but I found it on YT
hhahaha
like people at my age in my school are just partying and stuff, me be like: I just like math and programming
i can't recommend strang enough
it's an excellent course for getting comfortable with matrices of real numbers
I'm using numpy to get familiar with matrixes and also solving linear equations
its good to do it on paper too
yh
I'm doing it on papaer
paper
not numpy but linear combies and equations
and also on my whiteboard
thats good
can someone give me an example of a 1d,2d,3d array?
yh
2 dimensions means a list inside a list
So [[], []]
Emm... I think so.
But that's pretty useless
And you already know what 3D is now.
but how would you represent a matrix?
like this?
[[4 6]
[0 9]]
and so this is a 2d array of number otherwise known as a matrix right?
Well, you should likely use numpy for this
Ah ok great
but I get this
Yes. a matrix is a 2d array
import numpy as np
import matplotlib.pyplot as plt
Array = np.array([[4, 6], [0, 9]])
Array2 = np.matrix(Array)
print(np.ndim(Array))
print(Array2)
plt.plot(Array[0], Array[1])
plt.xlim(0, 10)
plt.ylim(0, 10)
plt.show()
But you need to make sure it's laid out correctly
2
[[4 6]
[0 9]]
so that is still 2d
so 3d
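For the 1d/2d/3d question, a minimal numpy sketch (the variable names are made up for illustration): each extra level of list nesting adds one dimension.

```python
import numpy as np

# 1D: a flat list of numbers (a vector)
one_d = np.array([4, 6, 0])

# 2D: a list of lists (a matrix with rows and columns)
two_d = np.array([[4, 6], [0, 9]])

# 3D: a list of matrices, here two 2x2 matrices stacked together
three_d = np.array([[[4, 6], [0, 9]], [[1, 2], [3, 4]]])

for arr in (one_d, two_d, three_d):
    print(arr.ndim, arr.shape)
```

`ndim` counts the nesting levels and `shape` gives the length along each one.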
nwm, I will learn how to write vectors and matrixes. this sht is hard
in programming. IRL it isn't that hard to do but in programming it's impossible
Haha
I'm 14 so I think I've some time hahahahah
Matrices are hard. Nevermind tensors
yup
It's impressive, keep up the enthusiasm
yh I know hardcore road to ML
sure dude thanks!
and let's not forget about indexing a matrix, not even talking about a tensor
So what in machine learning is represented by a 2d matrix?
Cuz I'm studying linear algebra and seeing this in Numpy I'm trying to apply it here
well every piece of data that requires more than one dimension like for example images have to be stored as 2d (or if you have rgb ones even 3d) data structures
and of course every transformation you apply to that piece of data also has to operate on those matrices
@stoic beacon
Thanks man
I can't decide if I want to use MLB stats to predict game outcomes or try the MNIST database. Both would be a NN
But if I go about looking for the MLB dataset, I wouldn't know what to look for or how to format the data. I know Kaggle exists but once I find a dataset there I get confused on what would be used for inputs or how to even decide what gets used for inputs
Hi all, I've posted a similar question in the r/LearnMachineLearning discord, but hopefully someone here can help me out, too.
I'm trying to follow the following tutorial:
http://linanqiu.github.io/2015/10/07/word2vec-sentiment/
However, when I try to replace the vectors in numpy.zeros with my own embeddings, I get the following error:
ValueError: setting an array element with a sequence.
Does anyone have any experience with this and / or how to solve this?
The error is quite explicit, it tells you that you are trying to set an element of a numpy array to a sequence, rather than a single value (scalar)
It's hard to help more without seeing the code
@stoic beacon MIT people learn the same stuff people from other universities do. They're not gods
I use online courses for MIT or similar for the majority of the stuff
The resources are just better
(I would assume that the error comes from you trying to set an element to a list or another array that holds the embeddings for a certain word)
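Without the original code this is only a guess at the cause, but the error can be reproduced with a small sketch. The sizes and the `word_vector` stand-in below are hypothetical and not taken from the tutorial.

```python
import numpy as np

embedding_dim = 3                          # hypothetical size, for illustration only
word_vector = np.array([0.1, 0.2, 0.3])    # stand-in for one embedding

# A 1D array has one scalar slot per index,
# so assigning a whole vector to flat[0] fails.
flat = np.zeros(5)
try:
    flat[0] = word_vector
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With one row per item and one column per embedding dimension,
# a row can hold a whole vector.
table = np.zeros((5, embedding_dim))
table[0] = word_vector
print(table[0])
```

If the real code looks like the first case, allocating the array with a second dimension equal to the embedding size may be all that is needed.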
Fair enough man. I always just assume MIT, Harvard, and the like are all harder since yaknow, they're for like...smart people and whatnot
Usually
Unless you have money but we won't go there
They might go slightly more in depth but that depth is good and still accessible. Where they really shine is specialised highest level (fourth year and grad) courses that both exist and are well done
Eg. MIT's underactuated robotics or Stanford's CNNs for deep CV courses are not something that are easy to find elsewhere
hello.... so there is a weird idea just popped up in my mind and I don't know which AI/machine learning libraries do I need if I want to achieve the following scenario...
Scenario 1 :
(The AI know nothing)
(The User's facebook account is public and has set the birthday)
AI:hi
User:I am John
AI:Hello, John
User:My facebook account is XXX
(the AI will now know the user's facebook)
AI: OK
User:How old am I?
(Then the AI will go to his facebook and search for the data)
AI: (give the answer)
Scenario 2 :
(The AI knows user is john now)
AI:Hello, John
User:What is the result of the barcelona vs liverpool on 8/5?
(The AI will now go search in google)
AI: Liverpool won and it is 4:0 (something like this)
Sorry...I know this kind of confusing...and thank you for trying to help me out...
i think thats basically what siri does
So Keras comes with the MNIST dataset but if it didn't how would you load that?
Since it's images
Also, side note: is TensorFlow hard to learn?
Also also, since a vector typically represents magnitude and direction how do vectors relate to machine learning?
I assume they're not talking about the same thing
In ML they are basically column only matrices. Vectors are just one column of values in ML.
Why are they called vectors? Because when you represent a vector in n dimensions, you can write it as:
ai + bj + ck +...
So, instead of writing out i, j, k,...
you can write it as a column matrix
with [[a] [b] [c]...]
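A small numpy sketch of that idea, with arbitrary example scalars: summing scaled basis vectors gives the same thing as writing the coefficients in a column.

```python
import numpy as np

# Standard basis vectors in 3 dimensions, written as columns
i_hat = np.array([[1], [0], [0]])
j_hat = np.array([[0], [1], [0]])
k_hat = np.array([[0], [0], [1]])

a, b, c = -1, 2, 5   # arbitrary example scalars

# The linear combination a*i + b*j + c*k ...
v = a * i_hat + b * j_hat + c * k_hat

# ... is just the column of coefficients [[a], [b], [c]]
print(v.shape)
print(v.ravel())
```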
If Keras didn't come with the dataset loaded, you would have to download the dataset manually. Check sentdex. He has a video on how to load a dataset. It's the second video of his new TensorFlow/Keras tutorial.
TensorFlow is lower level than Keras, but it's not that hard.
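For the "how would you load it yourself" part, here is a rough sketch. It assumes the files are in the IDX format the original MNIST distribution uses (a big-endian header followed by unsigned bytes); the function names are made up, and the demo runs on a tiny synthetic byte string rather than the real files.

```python
import struct
import numpy as np

def parse_idx_images(raw: bytes) -> np.ndarray:
    """Parse IDX image bytes into an array of shape (count, rows, cols)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:   # magic number for IDX image files
        raise ValueError(f"unexpected magic number {magic}")
    pixels = np.frombuffer(raw, dtype=np.uint8, count=count * rows * cols, offset=16)
    return pixels.reshape(count, rows, cols)

def parse_idx_labels(raw: bytes) -> np.ndarray:
    """Parse IDX label bytes into a 1D array of labels."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:   # magic number for IDX label files
        raise ValueError(f"unexpected magic number {magic}")
    return np.frombuffer(raw, dtype=np.uint8, count=count, offset=8)

# Synthetic stand-in for a file: two "images" of 2x3 pixels
fake_file = struct.pack(">IIII", 2051, 2, 2, 3) + bytes(range(12))
images = parse_idx_images(fake_file)
print(images.shape)
```

For the real download you would read each file's bytes first (the originals are gzip-compressed, so something like `gzip.open(path, "rb").read()`) and pass them to these functions.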
@stoic beacon
Can anyone explain why autoencoders are so popular compared to all the other models?
From what I've seen, it's not all that great in practice
Well, it captures representations pretty well.
Reduces the noise to a minimum, if not even removing it.
And is able to output something as close as possible to the original input.
@void anvil Depends on what they're for. For vision-y tasks, there's a lot more detail involved in making good images so autoencoders on their own don't work well but they're a pretty simple and cool way of reducing the dimensionality of your data to a much more dense representation
autoencoders are now used extensively in NLP tasks too.. look up BERT..
for capturing context aware representations of words.. ergo word to sentence embeddings
also.. dont mind me making nlp sound sexy.. it's not.. it's mostly mind numbing work and lot of Lisp :v.. I should've stuck to image processing..
Im trying to make sense of some numbers
consider I computed correlation of x vs a set (t, u , v, y, z )
what I stated above was cosine similarity..but apparently it's the same as pearson correlation coefficient
for centered vectors..
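A quick numerical check of that claim, using random made-up data: cosine similarity on the raw vectors generally differs from Pearson correlation, but after subtracting each vector's mean the two agree.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between two vectors."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, size=100)          # deliberately not centered
t = 0.5 * x + rng.normal(size=100)

pearson = float(np.corrcoef(x, t)[0, 1])
cosine_raw = cosine_similarity(x, t)
cosine_centered = cosine_similarity(x - x.mean(), t - t.mean())

print(pearson, cosine_raw, cosine_centered)
```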
naw NLP is great
anyone have experience with lowpass/highpass/bandpass filters on digital audio samples?
@karmic geyser me
why
Spent this entire semester + possibly the next one if I take the DSP elective
You've been typing for a while. Scared of how long the question might be
I'm using sounddevice in Python to get audio input and then output it with low latency. I want to turn a stereo audio input into 5 or 6 channels, which I will then output to a subwoofer, midrange speakers, and finally tweeters. Pretty much, I'm trying to do a 3-way crossover in software. For example, say I have an audio stream at a 44.1 kHz sampling rate and I have 1024 samples in an array: would I need to add some kind of delay of like 30 samples or so if I wanted to reduce the volume of everything below 3500 Hz by like 6 dB an octave? Also, what books/online resources would you recommend for doing basic stuff like a Butterworth filter?
@lean ledge
Why do you feel you need a delay of 30 samples? For notes, I would study from MIT's 6.007 Signals and Systems, lecture notes here https://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/lecture-notes/, in particular, butterworth filters here https://ocw.mit.edu/resources/res-6-007-signals-and-systems-spring-2011/lecture-notes/MITRES_6_007S11_lec24.pdf
The delay would be so when I go from 1 chunk of samples to the next chunk the filter would still be smooth.
30 samples was arbitrary but I don't know how many samples a normal filter would use.
maybe not so much a delay but memory of the last 30 samples or 30 processed samples given to the filter for each channel.
Thanks for the lecture notes, they are quite good. I don't have much experience with filters or reading and understanding university level math. I tend to understand math better if it's written in a programming language.
When you're done with signal processing basics there's https://ocw.mit.edu/resources/res-6-008-digital-signal-processing-spring-2011/ for focus on digital signals and then https://www.coursera.org/learn/audio-signal-processing for focusing on audio signals
I'm not sure I can help you with audio processing chunks because while I've done a bunch of signal processing, it has been in the context of signal theory rather than the details of specific implementations, but I believe you're looking for techniques involving Hann and Hamming windowing functions and then merging on top
I'll just say that if you use a technique such as IIR filtering (like butterworth filters) rather than something FIR based, you might be able to get decent filtering without worrying about dealing with merging the output of separate buffer frames
Uh what do you mean?
I feel like you would pass your samples through a windowing function and it would let you know how much activity is going on at a certain frequency bandwidth. and you would just repeat that say 1024 times with different bandwidths and use the output to make a spectrogram.
lower the volume of frequencies below 3500hz on a continuous stream of digital audio samples.
with a curve so the lower the frequency the lower its volume is.
@karmic geyser To skip the theory for you, construct a butterworth filter with the parameters you want (it's simple with scipy), then take advantage of the fact that butterworth is IIR and use the zf value that the lfilter function returns after you use a filter and pass that in into the next filter operation when you run the next batch
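A minimal sketch of that suggestion. It assumes a first-order high-pass, since that gives roughly 6 dB per octave of roll-off below the cutoff, and random noise stands in for real audio. The filter state returned by `lfilter` is fed into the next call, so the chunked output matches filtering the whole signal at once.

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 44100          # sampling rate in Hz (from the question)
chunk_size = 1024   # samples per chunk (from the question)

# First-order Butterworth high-pass at 3500 Hz (assumed design, adjust to taste)
b, a = butter(1, 3500, btype="highpass", fs=fs)

rng = np.random.default_rng(0)
signal = rng.normal(size=4 * chunk_size)   # stand-in for real audio

# Start from a zero state, as if the filter had only seen silence so far
state = np.zeros(max(len(a), len(b)) - 1)
chunks_out = []
for start in range(0, len(signal), chunk_size):
    chunk = signal[start:start + chunk_size]
    # lfilter returns the filtered chunk and the final state (zf)
    filtered, state = lfilter(b, a, chunk, zi=state)
    chunks_out.append(filtered)
streamed = np.concatenate(chunks_out)

# Reference: filter the whole signal in one call
whole = lfilter(b, a, signal)
print(np.allclose(streamed, whole))
```

In a real stream you would keep one `state` per output channel and per filter.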
I tried that but it didn't seem to be working. It was like it was just lowering the volume of the entire frequency spectrum.
Or you can use a pre-built system like GNU Radio with Python to set up the streaming architecture for yourself
@karmic geyser Every filter will reduce the volume to some extent
it shouldn't be by a lot
should be disproportionate
very disproportionate
you can always reamplify by multiplying by a constant as long as the wrong frequencies are filtered out
if they're not filtered out, there's probably something wrong with your parameters
let me just quickly upload the code I have. It was meant to be a 6th order butterworth but it was making the signal inaudible. if I multiplied the signal by like 2048 I could hear it again; it mostly seemed to be the same frequencies but with some distortion from compacting and expanding the samples.
How did you get the parameters?
You can use something like PyFDA to come up with the perfect filter https://github.com/chipmuenk/pyFDA
v good for filter design, I love it
Anyways, i'm dead tired from studying for my signals course, this isnt helping much :p I'm gonna go take a rest
Good luck!
hopefully the resources I linked can help a bit
Oh okay. this is my code.
line 10-16 is the filter values 18-22 applies the filter, 53-63 is where I actually pass the data
There are links to tutorials in the official documentation: https://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html
There's also a "10 minutes to pandas" tutorial in the official pandas documentation
thank you @lyric canopy
When using Colab, where do you save CSV files to be read in
yo guys!
Morning
howdy?
hahahah hwry?
Cuz you're not a cowboy lol
hahaha
@stoic beacon depending on how big it is, you can save it locally or on cloud storage..
Awesome thanks
guys on what do you need calc in data science? just curious
optimization
understanding and computing gradients is really important
understanding at least how and why convex optimization works is important
also finite series come up a lot, that's usually covered in calc courses even if it's not strictly calculus
quick question: for pandas, I have a dataframe where I have a timestamp column, an ID column, and another column category. In some cases, three rows can have the same ID and timestamp, but three different categories. Is there an easy way to drop all rows where this happens except one of them?
follow-up: This is not important, it doesn't really matter which row I keep since multiples is a good sign, but I have a 4th column snr which I could use to select which one to keep, i.e. keep the row with the highest snr value
@craggy geyser
Take a look at the following code (copy pasted from stackoverflow). It creates a dummy df and drops all rows where the values in the A and C column are not unique. It keeps the first non unique row.
import pandas as pd
df = pd.DataFrame({"A":["foo", "foo", "foo", "bar"], "B":[0,1,1,1], "C":["A","A","B","A"]})
df = df.drop_duplicates(subset=['A', 'C'], keep='first')
If you want to keep the row with the highest snr value and you don't mind changing the order of your df you can sort on snr to begin with before dropping.
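A sketch of that sort-then-drop approach on made-up data. The column names follow the question (`timestamp`, `ID`, `category`, `snr`) and the values are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": [1, 1, 1, 2],
    "ID": ["x", "x", "x", "y"],
    "category": ["A", "B", "C", "A"],
    "snr": [3.0, 9.5, 1.2, 4.0],
})

deduped = (
    df.sort_values("snr", ascending=False)                       # highest snr first
      .drop_duplicates(subset=["timestamp", "ID"], keep="first")  # keep that row per group
      .sort_index()                                               # restore original order
)
print(deduped)
```

The final `sort_index()` is optional and only there in case the original row order matters.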
ah, I see, by feeding it the columns, It will drop duplicates where the pair of those two columns are the same. That makes sense, and when you write here now I think I actually have done this in the past, should have remembered
and yes, that makes sense with snr of course
thanks!
np
so yh I'm learning data science and am thinking about buying a course, do they really cover at least the most of the stuff you really need to know?
because I'm thinking of buying the datacamp subscription
is it any good or are there better courses
why not do a free one?
like fast.ai
that's machine learning focused, but machine learning is a fine place to start nowadays for more general data science
isn't fast.ai significantly deep-learning focused?
(I think it's good, but not sure if it's the best recommendation for general data science.)
yeah it is
he's also 14
if you already know how to code, i don't see a problem with starting with deep learning
oh, yeah then disregard data science, acquire AGI skillz
especially if you're doing it as a hobby
if you don't fall into the "arrogant AI guy" trap then you should be fine transitioning into general data science
deep learning would've been such a blast if I could've started earlier
you can learn probability and stats later once you know the math and coding
instead I was learning javascript before javascript became good
heh
Deep learning should be learnt after ML
For one, deep learning is mostly useless and bad
What's deep learning for
Facebook begs to differ @lean ledge
For another, it's easier to learn how to treat it like any other model when you know how other models work
I don't think it's mostly useless and bad if you use it in places where it's clearly good. But don't use it to predict stock prices
facebook, google, openai, et al
How so?
Research ≠ practice
I am very very aware of ML research, I assure you
Deep learning excels in a few tasks but in practice as a data scientist, you almost never use deep learning
of course
hence why disregard data science, acquire AGI skillz :p
Deep learning is the way to do CV and NLP but apart from that there's few uses for it
which are huge problem domains right now
actually though, as a 14 year old, deep learning will be much more fun/better as a hobby than learning how to do pivot tables
at least 50% of the data science jobs i see are either CV or NLP or audio related
if this were a first college course I'd say yea go learn some statistics first
we must be seeing very different jobs
unstructured data is the big data of 2019
its a fad in some regards
but in others its a genuine big step forward
but if he's going to have fun with CycleGANs and make cool pixelated pokemon recolors I say go do that
CV and NLP are a minority of data science jobs and they require a large large amount of specialisation for the average job
You basically have to spend a year learning just deep vision after having already studied other DL and ML stuff in order to catch up on SOTA
thats fair
Where are you looking that half the data science jobs you see are CV or NLP or audio related?
my recommendation was targeted at a bright kid who's already good at programming and math, and wants a place to get started
@reef bone maybe in the wrong places
Yeah I rarely ever see a CV or NLP job lol
I'm genuinely wondering because I rarely see those at all
anyway i wouldnt have made that recommendation to anyone else
The few CV jobs I see are specialised robotics related jobs
it's like telling a kid "Don't learn javascript, start with learning big O and data structures"
DL is an approach to ML, I don't think you can really learn DL without ML
and frankly im only resisting this at all because your tone was confrontational
unnecessarily so imo
tl;dr use CycleGANs to make new pokemon sprites, but also read Murphy
^
I definitely think classic ML should be learnt before DL. It makes people too comfortable trying to use DL because that's what they're used to. It builds weak foundations in ML to start at DL.
i agree, for anyone over 14
I 100% agree for people getting serious in the topic
I disagree for a hobbyist wanting to pick up something new and cool
its like saying to learn what a hash table is before using turtle graphics or pyqt5
I s2g GAN SOTA changes faster than the hottest JS frameworks
lol feel like we're talking past each other at this point
(I was not referring to learning SOTA, just joking about the GAN hype)
anyway if you have a recommendation for a free data science course that isnt fast.ai, i'm sure the person who asked the question originally would appreciate the recommendation
and i would too, so i can recommend to others
oddly enough, Jeremy Howard probably would have been great for a datascience course
afaik that's his background
Honestly though @lean ledge the all caps nick is making you look more upset about this than you probably are
There are many other than fast.ai. Andrew Ng's course, Columbia's ML course (my preference), Google and Microsoft have their own free ones, etc.
Probably tbh
when people talk about Andrew Ng's course, are they still talking about the coursera one? or deeplearning.ai
honestly even pre-DL boom, I never liked his coursera course
I generally assume coursera unless proven otherwise
I didn't like it either
It's too shallow
And too much "don't worry if you don't understand" going on
also octave lol
That too
my main gripe with fast.ai is that that group is incredibly self-promotional
the content is solid though (if a little bit loosey-goosey)
i didnt like the coursera course either
not only "dont worry about it"
but also octave
3/3 surveyed people hate the coursera course lol
and teaching linear regression with gradient descent was always weird to me too
yes!
n o r m a l E q U a t I O n
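For anyone following along, a sketch of the two approaches being contrasted, on made-up data. The learning rate and iteration count are arbitrary choices for this example, not anything taken from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=200)   # made-up data: slope 2, intercept 1

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (X^T X) w = X^T y directly
w_exact = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the mean squared error
w_gd = np.zeros(2)
learning_rate = 0.1
for _ in range(5000):
    gradient = 2.0 / len(y) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient

print(w_exact, w_gd)
```

Both land on the same weights here. The closed form is one line for a small problem like this, which is presumably the gripe.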
I liked it but then again I already had a background in statistics and maths and learned to program in Matlab so maybe it was intended for me.
we've concluded that you need to read SICP but also He et al. 2015
i didnt know columbia had a free ML course
It was too boring for me. Not Mathy enough. Columbia's course was much nicer and when I looked back on it, I realised the topics were so practical. Stuff I use or see being used all the time in real world DS
i wonder what a more general purpose "data science" course would look like
vs a "ML" course which is what i normally see
data science is whatever you want it to be
well sure. i assume it'd spend more time talking about probability and stats, as well as data visualization
domain-specific business logic? sure!
hardcore data-engineering principles? why not!
ELBo for building VAEs? throw it in!
maybe start with stats and do ML at the end once they're a little more comfortable with the math and coding
Look at how good that syllabus is!
which syllabus is this, columbia?
Yah
that's pretty comprehensive
and fast moving
what are the prerequisites?
it's free online? that's pretty sweet
that's more ML than DS though?
DS is basically ML though? + Domain knowledge and blah blah hype words
sorta, ML+Stats
- you actually have to talk to the business people
at least that's my expectation when i see "data science"
I do actually think some level of data engineering should be in data science
databases, mapreduce
does anyone actually write low level map reduce stuff though nowadays
people just use spark
unless you truly have enormous data
would still be good to know the underlying principles though
oh, the concept? yeah definitely
But yeah it's a good course. Builds good fundamentals in the first 6 weeks and then goes over good foundations for related stuff like drawing out true latent factors through matrix factorisation and PCA, Markovian models, continuous state space extension to those, etc
i think that more falls under general programming skills than data engineering though
yeah, ill go look over the material at some point and start directing people there who ask
thanks
arguably even (very practical) things like database integrity when you have parallel requests
don't have to know how DBs actually handle them, but you need to know that it's an issue that people have to think about
I would argue that data science should cover how people have to handle data
of course, there're different perspectives
like my bayesian stat friends who treat deep learning as just "function approximators"
which isn't wrong
im one of those people i think
i also think that data engineering generally can be learned on the job
and i think a good org will deliberately get you to push your limits in that regard
btw guys do you need like linear algebra and calc to start learning ML and understanding it ?
you can start, but you won't get that far
you might also build up some bad habits and mistaken ideas without knowing how it works
for understanding it, they are necessary
so even before using linear regression, I should understand the formula
sht
that will take hella a lot of time
I'm on linear transformation rn
and calc == nothing yet
so it will take like 2 years right? around that
to understand a little bit more than the basics of ML
My 2 cents, at 14 I would play around with what I found enjoyable. If you put up a rigorous schedule for learning DL and/or ML and all related fields, you might find you are sick of it after two weeks and then do something else. Playing around and learning stuff in a suboptimal way is always better than giving up on learning it the right way. Playing around with a conv net for CV even if you don't understand everything that's going on, is much better learning than reading the intro chapter of some advanced calc book and then putting the rest of it away forever. If you enjoy it you'll find yourself learning the whole field soon enough. So first of all know yourself and then figure out what you want to learn.
I enjoy it, that's why I'm so enthusiastic about it and want to learn it.
I'm repeating it every day because, it might sound weird to some people but I love math and things that make sense like chopping a circle down and then putting it into a graph and getting the cm^2 that way for example
instead of using the normal pi r^2
sry dude bye have to go to sleep now, will read it tomorrow !
All of this has hurt my head
And just the fact that it's so vast and there's so much to know I think I'm just done lol
It's too much math that's way over my head and spending 2-3+ years to learn something to just the point of basic understanding is just not my idea of a hobby lol
@olive willow no for basic 1 variable linear regression you can get by with basic calculus if you want to really understand
you can skip that honestly
this is why people usually go to school for years when doing this stuff..
@desert oar what about me bruh
I need wisdoms too
Where can I best understand the high level principles behind ML and the algorithms involved? Not trying to become a math wizard or scholar or anything
what do you already know, and what are you trying to do with what you learn?
Just get a better understanding so you can follow the news and not be completely lost?
Or fit basic models?
Fit basic models sir
Maybe do some fun things with some Kaggle data or work related data
Ok
Im really not the best one to ask, i dont know too many resources
kaggle has their own tutorial content but its very limited
Yeah I saw :(
Oh well
I'll just keep gaining bits of info here and there as I watch things
Say. Is reservoir computing still a big thing? Or it never was?
I have some python code and I need to do a lot of maths on a lot of data inside a function in almost real time. How would I normally go about making it faster? I have already rewritten most of it with performance in mind as well as running multiple instances of the function on different signals in separate threads/processes. This is what line profiler says about the function. ignore the thing about bandpass as it's not actually a bandpass yet.
It's currently using about 3.3 ghz of total cpu usage and I need to try get it to under 1ghz
would you use something like cython? I haven't used it before.
Well. About cpython.
What's all this fuss about Python and CPython (Jython,IronPython), I don't get it:
python.org mentions that CPython is:
The "traditional" implementation of Python (nicknamed CPython)
yet another
Apparently PyPy, not PyPy3, is super fast. Faster than cpython.
@karmic geyser
Cython was probably the thing that was meant
Hey sorry I didn't have it open.
I could maybe use pypy instead of cpython. But I don't mean using a different interpreter I mean having a single function written in C or C++ code and called from python.
Yeah I'm running into performance issues on my desktop pc and eventually want to run it on a 1ghz single core arm processor. most of the code is fast enough it's just a few filters that I will be applying to a big array of data that might need to be done in c++. How hard is it to use cython for a single function?
It's designed for that, so hopefully not very hard
Okay I will give it a shot.
Like a tutorial?
Python Programming tutorials from beginner to advanced on a massive variety of topics. All video and text tutorials are free.
Thanks I will have a look at that.
@karmic geyser If you have time I've also heard nice things about https://github.com/pybind/pybind11 for when you need just one function running in c++. I've never tried it out myself though.
alright I will read about that too. I think I used ctypes for something in the past because I wanted to access some windows dll functions and python didn't have an interface.
Cython's main difficulty is sparse and IMO somewhat incoherent documentation
If you know C it's probably easier to learn
You're just trying to wrap a C++ function?
I have an algorithm in python but it's quite slow so I wanted to make just that algorithm function run faster.
I have about 23ms to run the algorithm on 1024 values. I think it would be a lot faster in c or c++.
If you share the code i can probably help
Chances are there is something you can do to improve performance without using cython
But yes, when I rewrite something in cython i usually get about a 50% performance improvement without doing much of anything other than copy and paste
Yo dude
How long does it take currently?
You can use libs to try make it faster ? Like numpy array instead of list
I already made some changes. Instead of using a circular buffer for some values I only store the last values of the chunk for the next one. I'm also running the algorithm on 2 different cores but that won't help when I move it to the embedded device. at the moment it uses up about 70% of a 4.5ghz cpu core. and I need to run it on a 1ghz arm cpu.
Well unless you post the code nobody can help
Are you operating on images? Text? Etc.
What do you mean by a buffer?
You're trying to iterate over something in chunks?
Audio data. maybe I meant ring array. let me post the code + the profiler
Thanks, it will just be much easier to assess the situation that way
OK, I think I have some performance improvements we can make once I get to a computer
The algorithm at the moment pretty much just acts kinda like a low pass filter. I will need a few different algorithms but I'm still learning how to implement them. Python seems to be kind of slow and I'm not really sure how to optimise python code. With other python stuff I have done it hasn't been a problem, but I think that was because libraries I used had the heavy stuff implemented in c or c++. I don't mind writing the actual algorithms in c++ or c, I'm just not sure what is the best way to do that and then call it from python.
I think the algorithm I wrote might just be a moving average
Yeah. Also looping is slow because of lots of memory allocation and other overhead
yeah, I figured if I wrote it in c++ I could avoid most of that.
If youre using numpy for looping you might as well use a list
Ill take a look but yes. This might be a good candidate for cython
Some stuff I might need to do is multiply every value in the array by a constant which I think numpy helps with. The audio library I use for outputting and recording the samples gives a numpy array as well.
Thats a 1 liner in numpy and extremely efficient
import numpy as np
x = np.array([1,2,3])
print(x)
y = x * 10
print(y)
'''
def stereotomono(left,right,gain):
    left = left * gain
    right = right * gain
    mono = left + right
    return mono
'''
Woops, but I was doing that to multiply an array to lower or increase the volume and then I was summing the 2 arrays.
Yep that should work
In your algorithm you loop over bpos twice for every pass over samples
Oh nvm
Hah yeah this is a moving average isnt it
pretty much I'm adding a few of the previous samples to the start of the algorithm.
Yeah I think so.
on the left is the effect it had on some music I was playing. the right is with the filter turned off.
https://stackoverflow.com/a/44797397/2954547 for a non numpy version
Numpy version https://stackoverflow.com/a/14314054/2954547
The comments on the numpy answer are enlightening as well
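A sketch along the lines of the numpy answer, using `np.convolve` (the linked answers may do it differently). The chunked helper is a made-up illustration of carrying the last few samples of one chunk into the next. It assumes a window of at least 2.

```python
import numpy as np

def moving_average(samples, window):
    """Average of each run of `window` consecutive samples."""
    kernel = np.ones(window) / window
    # "valid" keeps only positions where the window fully overlaps the data
    return np.convolve(samples, kernel, mode="valid")

def moving_average_chunk(chunk, tail, window):
    """Filter one chunk, using `tail` (the previous window - 1 samples) as history."""
    padded = np.concatenate([tail, chunk])
    return moving_average(padded, window), padded[-(window - 1):]

samples = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
window = 3

tail = np.zeros(window - 1)   # pretend there was silence before the stream started
outputs = []
for start in range(0, len(samples), 3):
    out, tail = moving_average_chunk(samples[start:start + 3], tail, window)
    outputs.append(out)
streamed = np.concatenate(outputs)
print(streamed)
```

Each chunk produces exactly as many output samples as it has input samples, so the stream never falls behind.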
Pretty much I want to get an audio signal and turn it into 3 audio signals. 1 that is lowpass below 150 hz. 1 that is bandpass of 150-3500hz. and 1 that is 3500hz highpass.
I tried using scipy butterworth filter but the tutorials were not really clear and it didn't seem to work correctly. it was just making everything quiet.
can you share your scipy code?
the example here looks straightforward enough to me
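from scipy.signal import butter, sosfilt
fs = 44100  # sample rate in Hz, adjust to your data
freq_lo = 150
freq_hi = 3500
filter_order = 6
# output='sos' is what sosfilt expects; fs lets the cutoffs be given in Hz
sos1 = butter(filter_order, freq_lo, 'lowpass', fs=fs, output='sos')
sos2 = butter(filter_order, (freq_lo, freq_hi), 'bandpass', fs=fs, output='sos')
sos3 = butter(filter_order, freq_hi, 'highpass', fs=fs, output='sos')
def filter3(y):
    return sosfilt(sos1, y), sosfilt(sos2, y), sosfilt(sos3, y)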
then you'll have to play around with the cutoffs and order in order to get the response to look how you want
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import butter, freqs
from collections.abc import Iterable
def plot_butter(N, Wn, btype):
    # analog=True, so Wn is an angular frequency in rad/s
    b, a = butter(N, Wn, btype, analog=True)
    w, h = freqs(b, a)
    plt.semilogx(w, 20 * np.log10(np.abs(h)))
    plt.title('Butterworth filter frequency response')
    plt.xlabel('Frequency (rad / sec)')
    plt.ylabel('Amplitude (dB)')
    plt.margins(0, 0.1)
    plt.grid(which='both', axis='both')
    # mark the cutoff(s) with vertical lines
    if isinstance(Wn, Iterable):
        for cutoff in Wn:
            plt.axvline(cutoff)
    else:
        plt.axvline(Wn)
plot_butter(6, 150, 'low')
plt.show()
# I think this generates the values that the filter will use.
def butter_bandpass(lowcut, highcut, fs, order=5):
    nyq = 0.5 * fs
    low = lowcut / nyq
    high = highcut / nyq
    sos = butter(order, [low, high], analog=False, btype='band', output='sos')
    return sos
# I think this applies the filter to the data.
def butter_bandpass_filter(data, lowcut, highcut, fs, order=5):
    sos = butter_bandpass(lowcut, highcut, fs, order=order)
    y = sosfilt(sos, data)
    return y
what's the nyquist frequency? i don't know much of anything about signal processing
fs was 44100, lowcut was 150 and highcut was 3500
fs = frequency sampling rate?
The Nyquist frequency is half of the sample rate. It's pretty much the highest frequency sine wave that a given sample rate can represent.
I had the order at 6 but it was making the entire song quiet. if I turned the order lower it just seemed to be making it slightly less soft. if I multiplied it by like 16384 it ended up sounding close to the original but with some distortion from compressing and then expanding the samples I think.
so your data is sampled how many times / second?
just so i understand what's going on here
44100?
44100 times
I pretty much have a microphone/ line input that I am playing music from my phone through. I then grab 1024 samples on my computer. apply some processing then output the 1024 samples to my computers headphones. the latency is somewhere under 0.1 seconds. all this is happening in real time and constantly streaming from an input to an output.
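For a stream like this, where the audio arrives 1024 samples at a time, one thing to watch is that the filter has memory: if each chunk is filtered from scratch, there can be a click at every chunk boundary. A minimal sketch of carrying the filter state across chunks with the `zi` argument of `sosfilt`, assuming a 150 Hz low-pass and an arbitrary order of 4 (the `filter_stream` helper and the random test signal are made up for illustration):

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 44100          # sample rate in Hz (from the discussion)
chunk_size = 1024   # samples per block (from the discussion)

# Low-pass at 150 Hz; order 4 is an arbitrary choice for this sketch.
sos = butter(4, 150, 'lowpass', fs=fs, output='sos')

def filter_stream(chunks, sos):
    """Filter an iterable of 1-D chunks, carrying filter state between them."""
    zi = np.zeros((sos.shape[0], 2))  # state per second-order section, starts at rest
    for chunk in chunks:
        out, zi = sosfilt(sos, chunk, zi=zi)  # zi in, updated zi out
        yield out

# Stand-in test signal: 4 chunks of noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4 * chunk_size)
chunks = signal.reshape(-1, chunk_size)

streamed = np.concatenate(list(filter_stream(chunks, sos)))
whole = sosfilt(sos, signal)  # same filter applied in one pass
```

With the state carried over, the chunked result should match filtering the whole signal in one pass.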
I think so.
ok
a 1 hz signal would pretty much be a sine wave that repeats every 1 second.
well anyway does your hand-written code work?
its just slow?
cause i feel like we can definitely get the butterworth filter working, but yes you can probably write your code in cython for a significant speedup
can you give an example signal i can test with
I can open a wave file, apply the processing, then output a wave file with some Python libraries I think.
ok, thats not really what im asking
but if you find a signal processing library that's probably the best option
I can't really give the same samples as I'm streaming it from an input. I can program some stuff to read/write the samples though.
any kind of test data should work
how are you testing your code in the first place?
anyway, in your code there are a few things that can be optimized
the variable n is completely unnecessary, it's always just equal to bpos_max by the time you use it
im still not totally sure what x is but if it's a sliding window, then you can obtain that much more efficiently
also i += 1 should be ever so slightly more efficient than i = i + 1 which might matter on an embedded device
yeah I was doing some stuff like adding 2 values then dividing it by n then resetting n back to zero. x is the summed value of the previous "bpos_max" samples
Yeah I wanted to do i++; but python doesn't have that haha
N is redundant with how it is atm.
pretty much I'm adding the previous chunks 16 samples to the start of the algorithm I'm summing samples 0-15 + the current sample then dividing it by the total samples which is 16. I'm then moving an offset by 1 and repeating it again up to the total number of samples in the chunk. I'm then setting the buffer to the last 16 samples of that chunk
so what happens when you're at the 3rd sample in the signal
you don't have 16 previous samples
the summed + divided value is then stored as a sample in an array to be returned
the first time I run it the 16 samples are all equal to 0.0
got it
use the moving average code i sent
in the stackoverflow examples
as long as you aren't extremely memory constrained it's the most efficient option
it pre-computes the entire sequence of cumulative sums
then subtracts off whatever is outside the window
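The cumulative-sum trick described here can be sketched in a few lines of numpy. This is a minimal version under the assumption that only full windows are wanted (so the output is shorter than the input); the function name is just for illustration:

```python
import numpy as np

def moving_average(x, window):
    """Mean of each full run of `window` consecutive samples."""
    # Prepend a zero so csum[k] is the sum of the first k samples.
    csum = np.cumsum(np.insert(np.asarray(x, dtype=float), 0, 0.0))
    # Sum of x[i:i+window] is csum[i+window] - csum[i].
    return (csum[window:] - csum[:-window]) / window

print(moving_average([1, 2, 3, 4, 5], 3))  # [2. 3. 4.]
```

Each output costs one subtraction and one division no matter how wide the window is, which is where the speedup over a nested loop comes from.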
yeah, I'm only cpu constrained. I got 512 megabytes of memory on the device that will run the code.
I think it's only like 35 kilobytes of samples.
alright let me think about this
It's ~1.5 megabytes a second of samples but I'm doing it at roughly 43 chunks a second so I don't need much memory. It's mainly a lot of operations.
yes you can rewrite this in cython and should see significant improvements
Ideally I would use someones library, put the filter values in then they would probably do it in c++.
Like what numpy and scipy probably does right?
yeah. or fortran
Haha why not verilog
hmm im confused as to how this buffer is working
Can I jump in to ask a dumb and unrelated question?
sure
Always
@karmic geyser it looks like you never update the buffer contents until the end of the loop
Yeah I don't need to update it until the end, it was a small optimisation I thought of haha.
i = 0
x = 0
n = 0
for cur_sample in in_data:
    for sample in buffer:
        x += sample
        n += 1
    buffer.replaceOldest(cur_sample)
    new_signal[i] = (x + cur_sample) / n
    i += 1
    x = 0
    n = 0
return new_signal
then how is the buffer being populated
that was the old code before I did some optimising and maybe changed it.
arent you just pulling the first 16 values all the time?
Been watching some TensorFlow videos and I honestly have no idea what's happening. I've watched enough high level videos and read enough articles that I generally understand how a neural net works but TensorFlow code is confusing me to shit. That being said, would Keras be sufficient for any stupid project I want to do? I don't need to do anything super scholarly or hardcore or like...cutting edge. Would the simplicity and ease of understanding of Keras be better?
probably sufficient
but what part of tensorflow is confusing?
like... what's the most sophisticated code that you can understand?
also are you sure you understand how a NN works? i dont mean to be confrontational, sometimes we think we know more than we do
Just the general workflow in terms of creating the actual net. What are Placeholders, what are Variables, what's a Graph, how do you create the layers, etc
do you know what backpropagation is,what gradients are, etc.?
I understand that there are input neurons which hold your data and the connections hold the weights and the hidden layers perform some activation function on your data
ok... you'll need a more technical understanding than that in order to understand tensorflow
which i highly recommend developing. but for now keras is probably more friendly for your use case
Then some calculus is used to find the minimum point, a la gradient descent
tensorflow is really a "differentiable tensor computation graph engine"
for which NNs happens to be the most immediate use
I gatcha
I must've watch that 3b1b video on NNs three or four times and yeah I clearly already forgot the big parts
Watched
yeah if you don't feel comfortable with the equations, you will struggle to make TF work for you
I was able to give a better detail a few days ago
layers are kind of an abstraction
So TensorFlow really kinda forces you to understand the math?
Not to sound like an ignorant fool or someone who doesn't want to learn it all, I just simply don't have the time
So something that abstracts away some of that math is probably best
The buffer keeps its state between function calls. Think of it as me remembering the last "n" samples. If I have all the samples in memory and I know what position I am up to, then I don't need to write anything to the buffer until I am done with the current chunk. Then I just save the last few values of the chunk that the next function call will need to smoothly apply the algorithm. I could replace the oldest value in the buffer with the newest sample and shift the index by 1 every time, but it's just not needed and it's slower than the way I switched to, even though it was easier to read.
What do you want to do with machine learning?
@stoic beacon
yeah it does @stoic beacon
@desert oar I had a feeling haha. The series I was watching had him creating out the actual z = xw + b and I'm like...wut
@karmic geyser oh just stupid work things. I try to self improve every so often and I pick a topic I'm interested in and try to learn some of it
Without going too deep into any one thing. Jack of all trades, master of none kind of thing but I'm okay with that
@karmic geyser im confused as to what your code is doing then
I enjoy Python so I picked ML to practice some Python while learning something that interests me
@karmic geyser how many elements is filterdata operating on at once?
if you can explain your algorithm in words it might help
You can probably find tutorials for tensorflow with stuff like "tensor flow character recognition" in google. I think there was a white to black 32x32 pixel thing that you trained to recognise letters.
And following a tutorial is great but it wouldn't be self improvement if I just blindly follow a tutorial and can't understand it
precisely
Even if I just loosely understand what Keras is doing I'd be happy haha
Even if I have to use the words "magic" and "awesome maths stuff"
And I don't usually put "awesome" and "math" in the same sentence
pretty much make a 32x32 pixel image with a character in it. apply some kind of distortions/blurs. have like 10 different ones for each character. you use that as your training data. it then sets weights of "neurons" so that it gets as close as possible to 100% correct guesses of your training data. pretty much the "neurons" will find patterns in the data based on the intensity and position of values and how they compare to ones adjacent to them, it will then "guess" at what the character should be.
you could do stuff like trying to centre the character in the image before your program guesses it as that could improve the accuracy.
a neural network pretty much compares the inputs to each other to come up with an output. you tune the neural network's parameters/shape/size/weights/how inputs are linked etc. so that it gives you the output that you want most of the time.
That's supervised learning.
@desert oar filterdata operates on an input of 1024 floats in an array and returns 1024 floats. It has an internal memory that stores the last 16 elements of the previous array it was passed and uses those to continue from where it last got up to. Each output sample is 17 samples added together then divided by 17; it then offsets everything by 1 to generate the next sample. The first sample uses all 16 values from the buffer + 1 from the input array. The second sample uses the newest 15 samples from the buffer + 2 from the input array, and so on until it's using 0 samples from the buffer and 17 samples from the input array. From that point onward it just offsets 1 by 1 along the input array, calculating a sample from 17 samples until it gets to the end of the array. At that point it saves the last 16 values of the array to the buffer to use next time.
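The scheme described here (keep the last 16 samples of the previous chunk, prepend them, then average 17 samples per output) can be sketched without any Python-level loops. This is not the original `filterdata` code, just a hedged numpy version of the same idea; the class name `ChunkedMovingAverage` is made up:

```python
import numpy as np

class ChunkedMovingAverage:
    """Hypothetical helper: (history + 1)-point moving average over chunks."""

    def __init__(self, history=16):
        self.history = history
        self.tail = np.zeros(history)  # all zeros before the first chunk

    def process(self, chunk):
        chunk = np.asarray(chunk, dtype=float)
        # Put the remembered samples in front of the new chunk.
        padded = np.concatenate([self.tail, chunk])
        kernel = np.full(self.history + 1, 1.0 / (self.history + 1))
        # 'valid' keeps only full windows, so len(out) == len(chunk).
        out = np.convolve(padded, kernel, mode='valid')
        # Remember the newest samples for the next call.
        self.tail = padded[-self.history:]
        return out

# Stand-in data: 4 chunks of 1024 random samples.
rng = np.random.default_rng(0)
sig = rng.standard_normal(4096)
f = ChunkedMovingAverage()
out = np.concatenate([f.process(c) for c in sig.reshape(-1, 1024)])
```

Processing chunk by chunk should give the same samples as running the whole signal through in one call, since the only state that matters is the 16-sample tail.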
yeah but you're always just using the final 16 elements from the previous array
there's no window sliding over the current array
also why are you even chunking it up like that
if your whole array can fit in memory then just do it all in one pass
while i < (samples):
    while ii < bpos_max:
        if (ii + i) >= bpos_max:
            x = x + in_data[i+ii-bpos_max]
        elif (ii + i) < bpos_max:
            x = x + buffer.getValue(i+ii)
that starts off with the 16 values from buffer then 15 then 14. slowly adding values from the regular array.
but where are you adding values
ohhhh
oh i see
ok
err why are you using this circular thing at all then
vs just storing the last 16 values in a regular old array?
the if and elif are swapped around cause I figured that 98% of the time the first if statement is true.
also you can just do else instead of that elif
yeah the else also works.
so why not just store the last 16 elements of the previous array?
why this ring buffer business?
Originally every sample was going into the ring buffer haha. Then I realised I didn't need to do that. I will replace it with a list
also why do this in chunks of 1024
instead of just.. an array
oh cause youre reading them 1024 at a time?
yes. It's going to be a digital crossover for speakers
1024 samples is enough that I should be able to do most algorithms and the delay isn't too much. Also less overhead than if I were to do it in chunks of 64 etc.
slow computers can't handle low chunk size as well. it adds stuttering to the playback
I'm going to have a small Linux device that receives an audio signal via S/PDIF or a 3.5mm stereo jack, then outputs it to 3 different 3.5mm jacks with different processing based on what kind of speaker it is. Only send bass to the subwoofer, high frequencies to the tweeters, etc. If you send too high an amplitude signal at a low frequency to a tweeter it will break it, and if you send too much high frequency stuff to a subwoofer the bass will not be as clear.
if you send me some sample data i can test my implementation
sample inputs and outputs
Pretty much my $350 speakers amplifier/subwoofer broke and it was proprietary. They don't make them anymore. I managed to find someone who was selling the exact same speakers cause they had the same problem and I got them for $30.
Yeah I'm writing a python script to read a wave file and pass it in chunks
guys so tensorflow is for ML and keras DL
yh
btw what do you need for DL
?
ML, math, programming, understanding of data. and what more?
Understanding of the industry/field that you are trying to use deep learning with.
I'm not in an industry yet but I want to go e-commerce or social media
social media it would probably help to know a bit about fake accounts and how to identify them to clean up your test data. e-commerce maybe it would help to know what data to collect on customers, if in doubt all of it that is legal haha
I'm not in the fields I was just trying to think of examples of things that would be industry specific knowledge that is relevant to the problems.
so do you use ML or DL for targeted ads?
I would say deep learning is still machine learning. It's just a category of machine learning that is more complicated.
Generally with machine learning you want it to be as simple as possible to get the result you want/are looking for.
yh not to over do it especially with data science
do you do predictive analytics with ML or just an algorithm
more complicated makes it harder to train, harder to understand, harder to predict etc.
yh
predictive analytics is something you do.
ooh sure
machine learning is one tool you can use to get there.
yh that's what I meant
can you give me an example of uses of ML in data science like for example the analyzing part
an example of predictive analytics might be this
what do you mean by the analyzing part?
so if you already have pre processed the data and you start analyzing it
Most machine learning data is pre processed before you give it to the machine learning code.
yh but what can you do with ML in data science? like group data that is coming every second for example a red flag of a bank transfer
oooh sure
basically "automated predictive modeling" = "machine learning", "one-off modeling or causal analysis" = "statistics"
the distinction is more one of application nowadays
the methods are different though
eg you wouldnt typically use a random forest to infer the distribution of cancer cell sizes
what's modeling I've seen it sooo many times but don't really know the definition
yh I know that
nor would you typically use a hierarchical bayesian model to run online fraud detection
unless you could optimize it for such a purpose
actually thats kinda not true, you totally could
but you know what i mean
modeling is... fitting a model
implied, with the purpose of capturing some truth about the world
rather than "just" making predictions
but you would use a linear regression model for example with sensor data to see if it didn't overheat for example and give out an inaccurate measurement
sure
linear regression kind of sits at the intersection between "machine learning" and "statistics"
Okay say you have some basic maths algorithms and you are a bank and you use it to work out whether you should or shouldn't give someone a loan. you can then keep data on all the loans you have and use machine learning to try find patterns in people who defaulted on those loans. you also could give loans to people that don't quite pass the basic maths algorithm to get more data and then from that use machine learning based on whether they defaulted on the loan or not.
yh
what rayzar said, but note that said "machine learning" algorithm could well be based on a statistical model
but what's a model??
but why male models?
lmfao
I have been saying that for a while and you are the first person to get it hahaha
fun fact, that bit was ad-libbed because Stiller forgot his next line
You can't script that kinda stuff haha
stupid question here, if I have a numpy array and I want to fill it from the start with data until another array is empty how would I do it?
array from shape (211,2) into shape (1024,2)
Stochastic gradient descent is a cost function right?
"Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties" I think so.
I think it might be a part of minimizing your cost function?
I'm not too sure sorry.
SGD is a convex optimization algorithm
just like newton's method, conjugate gradient descent, L-BFGS, etc
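To make the distinction concrete: the cost function is the thing being measured, and SGD is the procedure that nudges the parameters to make it smaller. A minimal sketch on made-up data, fitting the model z = x*w + b with mean squared error as the cost (the data, learning rate and epoch count are all arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 0.5 + 0.01 * rng.standard_normal(200)  # synthetic data

w, b = 0.0, 0.0   # parameters of the model z = x*w + b
lr = 0.1          # learning rate

def cost(w, b):
    """Mean squared error over the whole data set: this is the cost function."""
    return np.mean((x * w + b - y) ** 2)

start_cost = cost(w, b)

# SGD is the optimizer: it lowers the cost using one random sample at a time.
for epoch in range(50):
    for i in rng.permutation(len(x)):
        err = x[i] * w + b - y[i]     # prediction error on one sample
        w -= lr * 2 * err * x[i]      # gradient of err**2 with respect to w
        b -= lr * 2 * err             # gradient of err**2 with respect to b

end_cost = cost(w, b)
```

After training, `w` and `b` should land close to the 3.0 and 0.5 used to generate the data, and `end_cost` should be far below `start_cost`.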
@desert oar You know how I can put a small numpy array into a bigger array. pretty much its shape is (1024,2) for the big one and (<1024,2) for the second one.
I want to put it in right at the start.
Okay I got it sweet.
@karmic geyser what order? you want to insert them rowwise?
try this instead
outdata[:view.shape[0], :] = view
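A small self-contained version of that slice assignment, assuming the big block should be silent (zero) wherever the smaller array doesn't reach. The array contents here are placeholders:

```python
import numpy as np

outdata = np.full((1024, 2), 9.0)   # big block, pretend it holds stale samples
view = np.ones((211, 2))            # smaller block of new samples (placeholder data)

n = view.shape[0]
outdata[:n, :] = view               # copy the new samples in at the start
outdata[n:, :] = 0                  # zero the rest so no stale audio is played
```

The slice on the left has shape `(211, 2)`, the same as `view`, so the assignment copies row for row without any loop.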
oh woops thanks I made a typo and didn't notice
@karmic geyser https://github.com/gwerbin/python-discord_signal-filter
I have almost done the wave file thing for you.
its fine, i wrote it untested
i needed to learn how to use arrays in cython properly anyway
let me know if it works for you
or if it breaks
haha, i will give it a try.
you should be able to use it with from speaker_filter import filter_signal
@desert oar I sent you the files. It's 2:30 am so I will get some sleep then look into using the filter you did in cython
# Contouring and Plant Detection
cv2.imwrite('saved_mask.jpg', mask)
reuploadedImage = 'saved_mask.jpg'
preBlobD = cv2.imread(reuploadedImage, 1)
# cv2.imshow("Mask", preBlobD)
font = cv2.FONT_HERSHEY_COMPLEX
start = 1
for i in range(48):
    lower_value = np.array([0, 0, 0])
    upper_value = np.array([180, 255, 255])
    blobDetection = cv2.inRange(preBlobD, lower_value, upper_value)
    # Contours detection
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        partLabel = "{}".format(start)
        area = cv2.contourArea(cnt)
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        x = approx.ravel()[0]
        y = approx.ravel()[1]
        if area > 80:
            cv2.drawContours(preBlobD, [approx], 0, (0, 0, 0), 1)
            if start >= 49:
                break
            #Try to find the Center Here
            if len(approx) > 2:
                cv2.putText(preBlobD, partLabel, (x, y), font, 1, (255, 255, 255))
                start = start + 1
cv2.imshow("Mask", preBlobD)
ops
So i have this code that i wrote
and right now it functions to create contours of the masks (of another image)
ideally i want to get a list of all the x and y coordinates of each contour vertex
i need this because i want to take the averages of them so that I can find the center point of each plant (the subject of the masks)
does anyone know if and how that's possible
sorry bro i can't help you, ask maybe nix
@earnest prawn
why am i always listed as a reference for this i have no idea about data science
actually my mentor for my internship just called and i have to go to a meeting
cya ill be back with the same q :3
@earnest prawn because you're an smart boy
i am smart to the extent that i can use google (at least for this topic)
Does anyone have any suggestions
@olive willow @earnest prawn sry bout ping i try not to
IDK i'm 14 dude still learning even the math hahahhaha
hhahahahahha
ill be back i need to 3d print something. id appreciate if seomeone could help me with this. its a gate to moving on in my code
sure
@prime elm dont get me wrong, you can ping me as much as you like as long as its not spam, however regarding this topic I will very unliekly be of great use for you
@earnest prawn I get that. I was just told to. if you know someone who could help with this id appreciate it. I know how to find a centroid using pixel averages, but it doesnt solve what im aiming to do next, so i need help using the vertices of the polygon to find the center point
So I found the data is stored in the variable like this
[[892 563]]
....
[[896 577]]
[[897 576]]]
with the left column denoting x coordinate
and right denoting y
how would i extract them and put them into two separate lists?
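One way to split the columns, sketched with a small stand-in for the array: `cv2.approxPolyDP` returns points with shape `(N, 1, 2)`, so dropping the middle axis gives a plain two-column array. The values below are copied from the printout; the variable names are just for illustration:

```python
import numpy as np

# Stand-in for the array returned by cv2.approxPolyDP: shape (N, 1, 2).
approx = np.array([[[897, 568]], [[892, 563]], [[891, 563]], [[890, 562]]])

points = approx.reshape(-1, 2)   # drop the middle axis -> shape (N, 2)
xs = points[:, 0].tolist()       # left column: x coordinates
ys = points[:, 1].tolist()       # right column: y coordinates

# Average of the vertices as a rough center point.
center_x, center_y = points.mean(axis=0)
print(xs, ys, center_x, center_y)
```

Note that the mean of the vertices is only a rough center: it gets pulled toward wherever the polygon has more vertices. If that turns out to matter, the area centroid from `cv2.moments(cnt)` (`m10 / m00`, `m01 / m00`) may be a better fit.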
so
well
i'll just say that keras is very amazing framework
it's like python but for deep learning
it makes things very simple but yet flexible
i was able to create my own network thanks to keras even when being dumb and not knowing math at all
so yeah
that's an amazing lib
though i find data preprocessing quite hard
anyone got good libs to simplify that except opencv for images?
@prisma verge is that data type i posted above an array?
is it a 2 column array or one column
have no idea, sorry
@prime elm explain
it's a 3d array to be exact
because it has a list inside a list inside a list
so 3 lists
3d
you need one list less
@prime elm can you post the entire code or at least how you imported the data
@olive willow sure
# # # Start Up
#_______________________________________________________________________________________________________________________
### Library
import matplotlib.pyplot as plt
import numpy as np
import operator
import cv2
# Select Image
imageName = '04.24-13.26.jpg'
img = cv2.imread(imageName, 1)
# # # 1st Section of Code - Image Processing
#_______________________________________________________________________________________________________________________
# Grid Overlay
cv2.rectangle(img, (130, 25), (925, 340), (255, 255, 255), 1)
#Do the Processing
# Color Filtering
# HSV - Hue, Sat, Value
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_green = np.array([13, 0, 0])
upper_green = np.array([90, 225, 254])
mask = cv2.inRange(hsv, lower_green, upper_green)
res = cv2.bitwise_and(img, img, mask= mask)
# Morphology
#laplacian
laplacian = cv2.Laplacian(mask, cv2.CV_64F)
# Show the Image
# cv2.imshow('laplacian', laplacian)
# cv2.imshow('mask', mask)
# cv2.imshow('res', res)
cv2.imshow('Array of Plant Pots', img)
# # # 2nd Section of Code - Contouring and Blob Detection
#_______________________________________________________________________________________________________________________
# Contouring and Plant Detection
cv2.imwrite('saved_mask.jpg', mask)
reuploadedImage = 'saved_mask.jpg'
preBlobD = cv2.imread(reuploadedImage, 1)
# cv2.imshow("Mask", preBlobD)
font = cv2.FONT_HERSHEY_COMPLEX
start = 1
for i in range(48):
    lower_value = np.array([0, 0, 0])
    upper_value = np.array([180, 255, 255])
    blobDetection = cv2.inRange(preBlobD, lower_value, upper_value)
    # Contours detection
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        partLabel = "{}".format(start)
        area = cv2.contourArea(cnt)
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        x = approx.ravel()[0]
        y = approx.ravel()[1]
        if area > 80:
            cv2.drawContours(preBlobD, [approx], 0, (0, 0, 0), 1)
            if start >= 49:
                break
            #Try to find the Center Here
            if len(approx) > 2:
                cv2.putText(preBlobD, partLabel, (x, y), font, 1, (255, 255, 255))
                start = start + 1
cv2.imshow("Mask", preBlobD)
# # # 3rd Section of Code - Pixel Count and Surface Area of Each Plant
#_______________________________________________________________________________________________________________________
# Pixel Counter 0.2
# Recursion for Reading Masks
iteration = 0
for i in range(8):
    # Moving Frame Over Horizontally
    small_x = 150 + (92*i) # x_value refers to moving the frame left to right as columns
    big_x = 250 + (92*i)
    for j in range(6):
        # Moving Frame Over Vertically
        small_y = 35 + (101*j) # upper y_value has smaller numerical value
        big_y = 151 + (101*j) # lower y_value has bigger numerical value
        # Attempting to Output Grid as Multiple Windows
        fileString = 'saved_mask{}.jpg'.format(iteration)
        cv2.imwrite(fileString, mask[small_y:big_y, small_x:big_x])
        gridOutput = cv2.imread(fileString, 1)
        gridPart = 'Grid Part #{}'.format(iteration+1)
        cv2.imshow(gridPart, gridOutput)
        iteration = iteration + 1
        x = []
        y = []
        px = 0
        gridImg = gridOutput.astype('float')
        gridImg = gridImg[:,:,0] # convert to 2D array
        row, col = gridImg.shape
        # r and c instead of i and j so the outer loop variables are not reused
        for r in range(row):
            for c in range(col):
                if gridImg[r,c] == 255:
                    x.append(c) # get x indices
                    y.append(r) # get y indices
                    px = px + 1
        # Measurements
        # px : mm^2 :: 1 : (110/93)^2
        # ** is the power operator; ^ is bitwise XOR and fails on floats
        print("\nGrid Part #%i\n----------------\nX values of identified pixels: %s\nY values of identified pixels: %s\n\nNumber of Pixels(s):%i\nSurface Area (1 px : ((110/93)^2) mm^2): %i" % (iteration, str(x), str(y), px, px * (110/93)**2))
# # Close and Exit
# cv2.waitKey(0)
# cv2.destroyAllWindows()
what's the problem
whats a contour
the contours are built in the for loop
a contour is a blob isolating part of open cv
and it basically wraps around shapes
sure so when a color changes you want the x and y of that
Iknow
u need to draw lines to each vertex. and i want those vertexs
bc using that i can find the center of the shape
ooohh then I can't help you, I'm too noob for that
lol thanks for tryin
sure np sorry maybe nix
oof. idk if any of the admins know, I hope i figure out this problem
yh
the data above is in an array and im not good with thinking like a programmer fully yet. so idk how to extract only the x values from that array
and only the y values
oo
so you save the coordinates as an 2d numpy array?
errr let me paste
nope thats for pixel counting
all sections of code work
that section is inaccurate
bc what it does is
only not the dimension in the array?
because the code you've send before was 3d
you are somewhere adding another layer I think
the 3d array i think u said is the one im trying to extract in for from
im not adding in the code
um look for the variable called "approx"
approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
can you print it
ye
[[[897 568]]
[[892 563]]
[[891 563]]
[[890 562]]
[[878 562]]
...
[[895 579]]
[[896 578]]
[[896 577]]
[[897 576]]]
yes
its part of the library
and then print it out
aight
lets see
still the same
[[[897 568]]
[[892 563]]
[[891 563]]
[[890 562]]
...
etc
haha
do print(coord_array[0])
and change it to, [0, 0]
just coord_array[0, 0]
hhaha
now [0, 0, 0]
u think it a 2d array thats just in a 3d?
897
so its just a 2d array as a 3d
no look
and all i got to do is make a for loop to pile it into a list
haha true. but only 2 sets of data
those are 1d
kinda
but basically it is missing one "visual" dimension is what im saying
kinda
like i can hold one of the deminsions constant
yes
and us a for loop
because nothings there
yh
how does the len() function work on arrays?
idk try it
rip says 85
time to count
actually word counter.com
170 words
/2
85 lines
so its counts the group, not the number of elements
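That matches how `len()` works on a numpy array: it only reports the size of the first axis. A quick sketch with a zero-filled stand-in of the same shape as the printed array (85 points, each wrapped in an extra axis):

```python
import numpy as np

coord_array = np.zeros((85, 1, 2), dtype=int)  # stand-in with the same shape

print(len(coord_array))    # 85  -> size of the first axis only
print(coord_array.shape)   # (85, 1, 2)
print(coord_array.size)    # 170 -> total number of elements
print(coord_array.ndim)    # 3
```

So `.shape` is usually the more useful thing to print when trying to work out how an array is laid out.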
yh 2 numbers
hm?
if you want the count
len(str(coord_array))
yh
is it printing it out
it says 9
idk dude sorry have to go
npnp
bye ask nix
thanks for ur help
Hey, is there an easy way to remove the matplotlib axis while keeping its grid?
I'm not sure but there has to be
@rancid gust
oohh yh that makes sense
@sand reef thanks!
Np!
im trying to curve fit every point, anyone know how to do it?
essentially, i just want connect all the points up, but make the lines smoother
i could do this easily by hand, so it cant be very complicated
if it helps, the points will always increase in size
you just put all the points in an x/y array and plt.plot(x,y) them
if you want a function which fits all the points I am sure you can find one with a lot of work
this doesnt look like the graph fits the points
it looks like the points fit the graph
esepcially with the last one
It's just polynomial interpolation
ok interesting. that lead me to this article:
https://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
seems to be exactly what i need ty
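Since the points always increase, one option from `scipy.interpolate` that may suit this is `PchipInterpolator`: it passes through every point and does not overshoot between them, which a plain cubic spline can do. A minimal sketch with made-up points:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Made-up increasing data points.
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([0, 0.5, 0.6, 2.0, 2.1, 5.0])

curve = PchipInterpolator(x, y)          # smooth curve through every point
x_fine = np.linspace(x[0], x[-1], 200)   # dense grid for drawing
y_fine = curve(x_fine)
```

Plotting `x_fine` against `y_fine` with `plt.plot`, plus the original points as markers, gives the connected-but-smoother line.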
any tips for reading multiple .xlsx files in a folder with column headers that are slightly off?
Assuming that I'm not allowed to modify the source files,
I tried something like df = pd.read_excel(source location + filename,
usecols = lambda x: x.lower() in usecols_lower_list) but it doesn't seem to work
then concat it together
so, are you only limited to using pandas? or are willing to try other libraries?
because, you could try this, I hope it helps
@knotty nexus
thanks @sand reef , I just skimmed the chapter, it doesn't really seem to contain specific information on handling files with different column headers, but it did give me the idea to just skip the header rows when reading in the files, then create header names afterwards. I think in my case the columns in the different files remain in the same order, so this would also work
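Another option, if the headers only differ in case or stray spaces, is to normalize the column names after reading and before concatenating. A hedged sketch with two in-memory frames standing in for what `pd.read_excel` would return (the `normalize_columns` helper and the column names are invented for illustration):

```python
import pandas as pd

def normalize_columns(df):
    """Lower-case and strip header names so near-identical headers line up."""
    return df.rename(columns=lambda c: str(c).strip().lower())

# Stand-ins for the frames pd.read_excel would return for two files.
a = pd.DataFrame({"Name ": ["x"], "Amount": [1]})
b = pd.DataFrame({"name": ["y"], " AMOUNT": [2]})

combined = pd.concat(
    [normalize_columns(a), normalize_columns(b)],
    ignore_index=True,
)
print(combined)
```

This leaves the source files untouched, and the columns to keep can be selected from `combined` afterwards.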
np!
I wonder if anyone could suggest me something to start in reinforcement learning. I have covered mostly its mathematical part and yeah some basics like GYM etc to implement Q learning.
I am looking for something solid to start.
beginner here
Well, I too have to begin reinforcement learning myself, I could myself use some pointers where to begin its mathematics from, got any leads for me? All I have done is the regular machine learning and deep learning courses from coursera.
This could be the best tutorial to start
It basically covers all the prerequisites required for RL
like policy Grad,Actor Critic all basic information you need to start
Apart from that as above mentioned you can implement basic Q learning(You'll see in the tutorial) using GYM it is a toolkit for developing and comparing reinforcement learning algorithms.
I need someone to guide me from here...lol
Tysm!
Does anybody know how to apply PCA to data with a large amount of features (170000+)? I am currently using sklearn but my computer crashes when I try to obtain a cumulative explained variance curve to determine an optimal number of components.
No need to use a shallower CV based course to learn RL
Try Sutton's book, Berkeley's course and Spinning up RL
What exactly does Spinning up RL means
Hi all, is there anyone who has experience with XGBoost? I'm trying to train a model which has X-value as a feature, but for pretty much every method the error goes:
self._features_count = X.shape[1]
IndexError: tuple index out of range
Now, this makes sense since it's a single feature. So when I try to reshape my X_train value (e.g. using .reshape(1,-1)), the following error occurs:
ValueError: setting an array element with a sequence.
Does anyone know what's wrong?
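One thing that might be worth checking, sketched with made-up numbers: for a single feature the usual layout is one row per sample and one column, which is `.reshape(-1, 1)`, whereas `.reshape(1, -1)` produces a single sample with many features:

```python
import numpy as np

X_train = np.array([0.1, 0.4, 0.7, 1.0])   # one feature, four samples (made up)

X_col = X_train.reshape(-1, 1)   # 4 samples x 1 feature
X_row = X_train.reshape(1, -1)   # 1 sample x 4 features

print(X_col.shape)   # (4, 1)
print(X_row.shape)   # (1, 4)
```

The `setting an array element with a sequence` error often shows up when the array has `dtype=object`, for example when each element is itself a list. That is only a guess here, but printing `X_train.dtype` and `X_train.shape` before reshaping should confirm or rule it out.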