#data-science-and-ml | Python | Page 262

heady hatch Oct 21, 2020, 7:33 PM

#

Because that's how society uses facial recognition.

#

Terrible for certain groups of people but okay for others.

#

Some suggestions I have for dealing with certain groups of data points is you can either do feature engineering

#

ensemble the models

lapis sequoia Oct 21, 2020, 7:34 PM

#

yeah good point

heady hatch Oct 21, 2020, 7:34 PM

#

etc etc.

lapis sequoia Oct 21, 2020, 7:34 PM

#

whats that

heady hatch Oct 21, 2020, 7:34 PM

#

You might need to look into data science.

lapis sequoia Oct 21, 2020, 9:25 PM

#

@heady hatch any helpful resources for that?

steep olive Oct 21, 2020, 11:24 PM

#

Hey, I'm learning pandas and numpy... Can somebody tell me, How can I divide a dataframe in series?

heady hatch Oct 21, 2020, 11:27 PM

#

Could you clarify what you mean by dividing a dataframe into a series?

steep olive Oct 21, 2020, 11:30 PM

#

Get specific information from a dataframe by labels

#

Like "dates" and get all the column

lapis sequoia Oct 21, 2020, 11:40 PM

#

What are some good books to learn stats

heady hatch Oct 22, 2020, 12:00 AM

#

@lapis sequoia I've heard people like this book. https://web.stanford.edu/~hastie/ElemStatLearn/

#

@steep olive So get certain columns?

ie

if there's a column named 'a', get 'a' as a column from the dataframe?

steep olive Oct 22, 2020, 12:03 AM

#

Yeah, but I've found what I need, thank you XD

hollow sentinel Oct 22, 2020, 12:07 AM

#

hey guys

#

what's the difference between linear regression and multiple linear regression

#

can someone explain it simply

lapis sequoia Oct 22, 2020, 12:18 AM

#

Most of the time, what we (in ML) use is multiple linear regression.
When you have more than one independent variable, it becomes multiple linear regression (MLR).

#

Y = Model(X_1) Linear Regression
Y = Model(X_1, X_2, X_3) Multiple Linear Regression
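
A minimal sketch of the distinction above, assuming NumPy and a synthetic, noise-free target. The helper name `fit_linear` and the coefficients are made up for illustration; the only difference between the two fits is how many feature columns go in.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))  # three independent variables X_1, X_2, X_3
# Hypothetical noise-free target with intercept 4.0
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + 4.0


def fit_linear(features, target):
    """Ordinary least squares with an intercept column appended."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # last entry is the intercept


simple_coef = fit_linear(X[:, :1], y)  # Y = Model(X_1): one slope + intercept
multiple_coef = fit_linear(X, y)       # Y = Model(X_1, X_2, X_3): three slopes + intercept
```

The simple fit returns two numbers and the multiple fit returns four; the fitting procedure itself is unchanged.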

hollow sentinel Oct 22, 2020, 12:19 AM

#

very cool

#

thank you @lapis sequoia

full narwhal Oct 22, 2020, 12:58 AM

#

I was looking at different ways of doing array filtering in Python, and came across something I find weird. Why is the second method of filtering the fastest?

📎 unknown.png

lapis sequoia Oct 22, 2020, 1:11 AM

#

@full narwhal Actually the difference is not a lot (at max 20ish %).
Anyway the difference is most likely due to how indexing works in the background for numpy arrays.

In the 1st and 3rd you need to calculate the row and column index k=i*ncol+j for each cell.
But in the 2nd you are avoiding that computation. Therefore it is a lil faster.

#

I'm not sure if I'm accounting for all the possible reasons, but the above is one of them.

full narwhal Oct 22, 2020, 1:13 AM

#

@lapis sequoia The difference isn't a lot, but the order is consistent across multiple array sizes. Naively, though, doesn't the second method have the most amount of allocations?

#

There's one for the data[1] >= 0.75, then one for np.where(), then one for data[0][index_list]

#

and i feel like there should be a way to combine what np.where is doing in the other two methods without that extra allocation

#

am i missing something?

lapis sequoia Oct 22, 2020, 1:17 AM

#

https://stackoverflow.com/questions/55123613/why-numpy-where-is-much-faster-than-alternatives
If indexing is not the cause, then its C implementation is the reason for the speed.

Stack Overflow

Why numpy.where is much faster than alternatives

im trying to speedup the following code:

import time
import numpy as np
np.random.seed(10)
b=np.random.rand(10000,1000)
def f(a=1):
tott=0
for _ in range(a):
q=np.array(b)
...

full narwhal Oct 22, 2020, 1:18 AM

#

that code is comparing python performance to numpy perf. i dont really see what it has to do with this

lapis sequoia Oct 22, 2020, 1:20 AM

#

%%timeit
index_list = [data[1] >= 0.75]

%%timeit
index_list = np.where(data[1] >= 0.75)

full narwhal Oct 22, 2020, 1:21 AM

#

📎 unknown.png

lapis sequoia Oct 22, 2020, 1:21 AM

#

try to break the code into more cells and time each one. Then see if it is due to numpy's C optimisation or not.

lapis sequoia Oct 22, 2020, 1:37 AM

#

@full narwhal

📎 Screenshot_2020-10-22_at_7.07.26_AM.png

#

As you can see, np.where is slower, but its output is in the form of indices, whereas the simple conditional gives True and False values. That leads to different sizes.
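
A small sketch of the two index formats being compared, assuming a 2-row array where row 0 holds the values and row 1 holds the filter key (the layout implied by `data[1] >= 0.75` and `data[0][index_list]`). It only shows that both routes select the same elements; it makes no claim about which is faster on a given machine.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((2, 10_000))  # assumed layout: row 0 = values, row 1 = filter key

mask = data[1] >= 0.75          # boolean array, one entry per column
(indices,) = np.where(mask)     # integer positions where the mask is True

by_mask = data[0][mask]         # boolean-mask indexing
by_index = data[0][indices]     # integer-array indexing
```

The mask always has as many entries as the original row, while the index array has one entry per selected element, which is the size difference mentioned above. Timing each statement separately with `%timeit` is one way to see where the time goes.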

full narwhal Oct 22, 2020, 1:40 AM

#

Yes, but that computation has to happen either way

#

I would argue np.where has to do the extra step of gathering the indices

#

The way i see it, why doesn't the simple solution do what np.where does, except rather than gathering the indices, it gathers the associated data[0][i] (which should be an O(1) operation)

lapis sequoia Oct 22, 2020, 1:42 AM

#

If you want to go indepth on why it is happening like that and why is there a difference then I can only suggest to look under the hood.

velvet thorn Oct 22, 2020, 2:40 AM

#

hm.

#

this is an interesting problem

#

@full narwhal I don't have an answer, only a guess

#

and my guess is that in the np.where case, it's faster because the size of the result is known

#

so there is only ever one allocation

#

I'm not sure if there's a way to track allocations, but if there is that might help?

full narwhal Oct 22, 2020, 2:48 AM

#

@velvet thorn The size of the result is known because np.where had to figure out the size for its return array

#

and you still have to allocate another array for the result, no?

#

it's not overwriting the index array

velvet thorn Oct 22, 2020, 2:49 AM

#

and you still have to allocate another array for the result, no?
@full narwhal yes

#

but what I'm saying is

#

the difference is in the column-level indexer for the original array, right?

#

whether it's a boolean mask or an array of indices

#

and I'm saying that that part is faster with the latter because the length of the result is known in that case

#

but not in the former case

#

since you have to traverse the entire boolean mask to know the length of the result

#

so presumably there's a number of reallocations, which lead to the slightly higher time

daring crag Oct 22, 2020, 2:52 AM

#

Hello there! Im new at data science and i want to start but i dont know from where... Can someone recommend me a course, Tutorial or any resource? Thanks btw

velvet thorn Oct 22, 2020, 2:52 AM

#

non-conclusive support:

#

if you reduce the size of the original array by a lot, the former is actually faster

#

which, if my supposition were valid, would make sense, because there'd be fewer reallocations

full narwhal Oct 22, 2020, 2:53 AM

#

@velvet thorn what i'm saying is np.where doesn't know what size the output array is to begin with, either

#

right?

velvet thorn Oct 22, 2020, 2:53 AM

#

@velvet thorn what i'm saying is np.where doesn't know what size the output array is to begin with, either
@full narwhal yes, it doesn't

#

but it calculates it

#

that's my point

full narwhal Oct 22, 2020, 2:54 AM

#

what i'm asking is why couldn't we skip the intermediate step of collecting the indices?

velvet thorn Oct 22, 2020, 2:54 AM

#

The way i see it, why doesn't the simple solution do what np.where does, except rather than gathering the indices, it gathers the associated data[0][i] (which should be an O(1) operation)
@full narwhal because it's not necessarily faster, perhaps

#

what i'm asking is why couldn't we skip the intermediate step of collecting the indices?
@full narwhal there's probably some sort of tradeoff.

full narwhal Oct 22, 2020, 2:56 AM

#

so i just tested it on a 2x1000 array and you seem to be right, but at 2x10000 it seems to flip the other way around (and i wouldnt really consider 10000 to be large)

#

but i still feel like there's something missing here; i just don't see how the np.where method can be faster for larger arrays when it has to do an extra step

#

and @daring crag https://jakevdp.github.io/PythonDataScienceHandbook/ is a good start

Python Data Science Handbook | Python Data Science Handbook

velvet thorn Oct 22, 2020, 2:58 AM

#

but i still feel like there's something missing here; i just don't see how the np.where method can be faster for larger arrays when it has to do an extra step
@full narwhal like I said, presumably the different indexing format cuts down on reallocations when applied to the original array

#

but without looking at the source it'd be hard to tell I guess

ruby summit Oct 22, 2020, 3:15 AM

#

Hello everyone. What do you think is the appropriate mixture of data science skills and domain knowledge?

jolly plank Oct 22, 2020, 3:21 AM

#

Can somebody help me with question

#

here is the question

📎 unknown.png

#

hello...

lone osprey Oct 22, 2020, 3:43 AM

#

Filter the subset.. does anyone understand the question??

deep galleon Oct 22, 2020, 4:08 AM

#

I'll be that guy.. if someone is knowledgeable with numpy masked array, I posted a question with a simplified example in #help-pancakes 🙂

mild topaz Oct 22, 2020, 6:18 AM

#

my code here https://paste.pythondiscord.com/ezojitenom.py
Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\lib\site-packages\flask\app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\Admin\anaconda3\lib\site-packages\flask\app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Users\Admin\anaconda3\lib\site-packages\flask_restful\__init__.py", line 468, in wrapper
    resp = resource(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\lib\site-packages\flask\views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "C:\Users\Admin\anaconda3\lib\site-packages\flask_restful\__init__.py", line 583, in dispatch_request
    resp = meth(*args, **kwargs)
  File "E:\demo3\findDocumentType1.py", line 126, in post
    self.resize_im(image_data)
  File "E:\demo3\findDocumentType1.py", line 209, in resize_im
    img = preprocessing(img)
  File "E:\demo3\findDocumentType1.py", line 194, in preprocessing
    img = img//255
TypeError: unsupported operand type(s) for //: 'NoneType' and 'int'

vague bear Oct 22, 2020, 6:50 AM

#

in real jobs, how common is it to code the visualization with Python matplotlib as opposed to tools like Tableau?

odd yoke Oct 22, 2020, 7:27 AM

#

Ime it depends on the team and the tools they use. If they're already doing all the preprocessing and analysis in python, it's generally simpler to directly plot stuff from there with mpl/pyplot

#

Same with r and ggplot etc

sudden delta Oct 22, 2020, 8:50 AM

#

just used matplotlib for realtime visualization recently, it happened to be really quick to get going

#

short answer: matplotlib if engineers/scientists are looking at it, tableau if managers and up are looking at it :>

hard swan Oct 22, 2020, 11:23 AM

#

Is this room also for Machine Learning?

pale thunder Oct 22, 2020, 11:23 AM

#

yes

hard swan Oct 22, 2020, 11:23 AM

#

I am starting to learn ML and I would be asking many questions about that

#

I have the book Python Machine Learning 3rd Edition by Sebastian Raschka

spark nimbus Oct 22, 2020, 12:13 PM

#

Does anyone have good references for signal processing?

mild topaz Oct 22, 2020, 12:40 PM

#

model = load_model(pathlib.Path(r'E://demo3\\united_kingdom_50.h5')) this is not working

grave frost Oct 22, 2020, 1:04 PM

#

@daring crag What exactly do you find interesting?

lapis sequoia Oct 22, 2020, 1:17 PM

#

Can somebody help me with question
@jolly plank just apply a filter or condition to select the desired category.

#

in real jobs, how common is it to code the visualization with Python matplotlib as oppose to tools like Tableau?
@vague bear It'll depend on your job. If your job requires you to give a final viz. report then maybe you don't want to use matplotlib. But when you are doing analysis as a subtask in a project, or you just need quick plots, then you may go for matplotlib.

Conclusion: If you are serving the final result to a non technical group or its a presentation then you may want to have a nice dashboard made from tableau.

#

@spark nimbus
You can look into this course from Coursera.
https://www.coursera.org/learn/advanced-machine-learning-signal-processing

Coursera

Advanced Machine Learning and Signal Processing

Offered by IBM. >>> By enrolling in this course you agree to the End User License Agreement as set out in the FAQ. Once enrolled you can access the license in the Resources area <<< This course, Advanced Machine Learning and Signal Processing, is part of the IBM Advanced Dat...

spark nimbus Oct 22, 2020, 1:44 PM

#

@lapis sequoia is that one mainly focused around machine learning or mainly signal processing, because I don't need the former

#

nvm, seems to mostly be machine learning and the applications of it in signal processing

lapis sequoia Oct 22, 2020, 1:46 PM

#

yeah i thought you were looking for ML. This is the data science channel, tbh.

#

@spark nimbus You can try this. This looks pure signal Processing. https://www.coursera.org/specializations/digital-signal-processing

Coursera

Digital Signal Processing

Offered by École Polytechnique Fédérale de Lausanne. This Specialization provides a full course in Digital Signal Processing, with a focus on audio processing and data transmission. You will start from the basic concepts of discrete-time signals and proceed to learn how to ana...

#

But it looks paid. I'm not able to find audit option in Coursera.

spark nimbus Oct 22, 2020, 1:49 PM

#

Yeah, I was about to say it requires a login :/

lapis sequoia Oct 22, 2020, 1:49 PM

#

You can try free trial..

#

There was an audit option in coursera courses. You could watch free videos. But now it looks like they have removed that option.

spark nimbus Oct 22, 2020, 1:52 PM

#

The main issue in signal processing is that the basic concepts are pretty simple, but for anything slightly more complex you need to suddenly understand a couple dozen terms that you've likely never heard before, and I just keep getting lost in all this, especially since I have a hard time visualizing it in my head

#

To give an example:

📎 Screenshot_20201022_155541.png

lapis sequoia Oct 22, 2020, 1:58 PM

#

Well I studied that in my university.
You can also try this:https://nptel.ac.in/courses/117/102/117102060/#
Increase the speed and watch. If you know some basics then this can help.

Also you can look for OCW MIT Lectures. They are pretty old but still make sense.

NPTEL :: Electronics & Communication Engineering - Digital Signal P...

NPTEL provides E-learning through online Web and Video courses various streams.

spark nimbus Oct 22, 2020, 2:02 PM

#

oh now that you mention it, I might still have the PDFs of the books in my uni's mega drive

#

I think they had a signal processing course

austere swift Oct 22, 2020, 2:28 PM

#

For some reason i'm getting a keyerror when trying to read a column from a dataframe in pandas, when I know the column name is correct

lapis sequoia Oct 22, 2020, 2:29 PM

#

df.columns will give you columns.
print(df.columns) and see the column names

austere swift Oct 22, 2020, 2:29 PM

#

here it shows the dataframe and the column lists but it still says keyerror trade type

📎 unknown.png

#

yeah i already did df.columns

#

you can see 'Trade Type' is in the dataframe and in df.columns but it still gives me a keyerror

lapis sequoia Oct 22, 2020, 2:30 PM

#

check for whitespaces and \n

lone osprey Oct 22, 2020, 2:32 PM

#

Try checking data in database

#

Like pasta told

#

U like pasta, pasta?

#

Or ur name is pasta?

austere swift Oct 22, 2020, 2:34 PM

#

what's weird is that when I try to call 'Trade Date' it works fine, but 'Trade Type' doesnt work

lone osprey Oct 22, 2020, 2:35 PM

#

I think u have to check on data only

austere swift Oct 22, 2020, 2:35 PM

#

what do you mean

lapis sequoia Oct 22, 2020, 2:35 PM

#

@lone osprey I keep my nicks related to food and fruits. And yes i like pasta.

lone osprey Oct 22, 2020, 2:36 PM

#

Nice😁

spark nimbus Oct 22, 2020, 2:36 PM

#

Fortran
Yeah, this is a 1999 book alright

📎 Screenshot_20201022_163547.png

lone osprey Oct 22, 2020, 2:36 PM

#

what do you mean
@austere swift check like pasta told

lapis sequoia Oct 22, 2020, 2:36 PM

#

what's weird is that when I try to call 'Trade Date' it works fine, but 'Trade Type' doesnt work
@austere swift did you check for whitespaces and \n ?

austere swift Oct 22, 2020, 2:36 PM

#

how would I check that?

lone osprey Oct 22, 2020, 2:37 PM

#

Show us data once

lapis sequoia Oct 22, 2020, 2:37 PM

#

df[df.columns[6]]

#

see where the Trade Type is in the columns array. And select it.

#

most likely it will be at index 6. if it works then there is some whitespace

austere swift Oct 22, 2020, 2:39 PM

#

df[df.columns[6]]
@lapis sequoia this works for some reason but putting the string directly doesnt

#

maybe it has some weird whitespace thats not a normal space

#

but even when i copy it from the terminal it doesn't work

lapis sequoia Oct 22, 2020, 2:39 PM

#

You can't see space when you print it.

solar bluff Oct 22, 2020, 2:40 PM

#

Yeah I've had df columns do that, when there was a space included at the end of the name that isn't obvious

austere swift Oct 22, 2020, 2:40 PM

#

yeah, I guess i'll just use df[df.columns[6]] instead just as a workaround

solar bluff Oct 22, 2020, 2:41 PM

#

you could also just rename that column?

austere swift Oct 22, 2020, 2:41 PM

#

well it's being scraped from a website, so it would probably just be easier to use that as a workaround

lapis sequoia Oct 22, 2020, 2:42 PM

#

@austere swift you can clean those spaces by using str.strip()

#

df.columns = [col.strip() for col in df.columns]

austere swift Oct 22, 2020, 2:44 PM

#

Yeah i'll try that

#

nope that didnt work either
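
One possibility the thread does not rule out is whitespace inside the name rather than at its ends, for example a non-breaking space from scraped HTML, which `strip()` leaves in place. The sketch below is an assumption about the cause, not a diagnosis; the frame and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical frame: the second column name contains a non-breaking space (\xa0).
df = pd.DataFrame({"Trade Date": ["2020-10-21"], "Trade\xa0Type": ["Buy"]})

# repr() exposes characters that print() renders like an ordinary space.
raw_names = [repr(col) for col in df.columns]

# str.split() with no arguments splits on any Unicode whitespace, so joining
# the pieces with a plain space normalises both edge and interior whitespace.
df.columns = [" ".join(str(col).split()) for col in df.columns]

trade_type = df["Trade Type"]  # now matches the name typed with a normal space
```

Printing `raw_names` shows the escaped `\xa0` in the second name, which is a quick way to check whether this is what is happening with real data.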

haughty nymph Oct 22, 2020, 2:56 PM

#

Hey folks, does anyone know any good and robust ways to convert a pretty extensive MATLAB script to a Python script?

lapis sequoia Oct 22, 2020, 2:57 PM

#

i got a question

#

Hey folks, does anyone know any good and robust ways to convert a pretty extensive MATLAB script to a Python script?
@haughty nymph You'll have to write it down in python. Or you can see if there is some library or some repo where the required script is already written.

#

i got a series of 3d brainscans with their labels

#

how can i extract them with the correct labels

#

its in matlab file

#

what is the format of 3d brainscans ? images?
And where are the labels? In filename ?

narrow flume Oct 22, 2020, 3:22 PM

#

Have the user input a list of columns for a table

Have the user input a data type for every column: int, float, string size 255

In a loop have the user input the values for each row

Ensure the string size doesn't exceed 255

Print the results when the user is finished

#

for the second sentence, how do you know when the user has already input every column include int , float , string size 255?

lapis condor Oct 22, 2020, 3:25 PM

#

does anyone know how to implement the naive Bayes classification algorithm? I understood how it works but I'm new to the Python language.

lapis sequoia Oct 22, 2020, 3:28 PM

#

@lapis sequoia the labels are in an array

#

@lapis sequoia the labels are in an array
@lapis sequoia I'm not sure what is your problem.
Do you want to train a image classifier to classify the brainscans with correct labels ?

#

does anyone know how to implement the naive Bayes classification algorithm? I understood how it works but I'm new to the Python language.
@lapis condor If you are looking for a basic naive Bayes classification implementation in python, plenty are available on the internet.

You just have to create a table of probability(frequency/total) for each word by each class.

#

And if the variables are numbers (decimals) then it is a lil tricky.
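
A minimal sketch of the frequency-table idea for word features, using only the standard library. The function names, the toy documents, and the choice of add-one (Laplace) smoothing are assumptions for illustration, not a reference implementation.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """docs: list of token lists; labels: list of class names."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> frequency
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(tokens, class_counts, word_counts, vocab):
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior of the class
        for w in tokens:
            if w not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids log(0) for words unseen in this class.
            p = (word_counts[label][w] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, made up for the example.
docs = [["cheap", "pills", "buy"], ["meeting", "at", "noon"],
        ["buy", "cheap", "now"], ["lunch", "meeting", "today"]]
labels = ["spam", "ham", "spam", "ham"]
model = train_naive_bayes(docs, labels)
```

Calling `predict(["cheap", "buy"], *model)` scores each class by its log prior plus the log probabilities of the words. For numeric features, a common alternative is to model each feature per class with a Gaussian instead of a frequency table.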

lapis condor Oct 22, 2020, 3:37 PM

#

Actually, I'm looking for something that doesn't use iris. I got specific dataset and was asked not to import

#

@lapis sequoia If you could help with that please

lapis sequoia Oct 22, 2020, 3:42 PM

#

@lapis sequoia the problem is that I have never used matlab files

#

And I need a way to extract several scans and their corresponding labels to train a model

#

However, I'm not sure how to do so

#

@lapis sequoia
can you tell me the extension of the file ?

#

in which the brainscan image is stored

narrow flume Oct 22, 2020, 3:50 PM

#

Could anyone help me with python numpy structure array?

lone osprey Oct 22, 2020, 3:53 PM

#

Yup

narrow flume Oct 22, 2020, 3:55 PM

#

Have the user input a list of columns for a table

Have the user input a data type for every column: int, float, string size 255

In a loop have the user input the values for each row

Ensure the string size doesn't exceed 255

Print the results when the user is finished

You may use Numpy or Pandas to do this

#

I choose to use numpy

#

import numpy as np

a = int(input("Size of array:"))
my_array = []
for i in range(a):
    my_array.append(input("Values:"))
my_array = np.array(my_array)

#

here's what i have been doing

#

how do i know when user has already input a data type of every colum: int , float , string 255

#

@lone osprey

lapis sequoia Oct 22, 2020, 4:22 PM

#

@lapis sequoia
can you tell me the extension of the file ?
@lapis sequoia well its a .mat file

#

i believe its several thousand 3d images

#

import scipy.io
X = scipy.io.loadmat('file.mat')

#

Hello guys, has anyone here tried using neural networks to build better trading bots?

#

@lapis sequoia
Images are nothing but arrays. Just load them as shown above. Check the shape of X.
Your X will have a shape like (n, h, w, ch).

#

Where n = number of images.
h = height of brainscan images
w = width of brainscan images
ch = channels. (=3 if it's a color image)

#

Apply CNN on them and you should be able to get a decent classifier.
Or Use transfer learning if the images are not enough.

#

its telling me that its a dictionary

#

📎 unknown.png

#

@lapis sequoia Ok. Then you will just have to extract the value from the right key.
Most likely it's in the last tuple. ('Data', array[])

#

how would i do that

#

Yeah its the key with 'Data'.
data = scipy.io.loadmat('file.mat')
X = data['data']

#

And you can see it's a 4D array as I mentioned above. (n,h,w, ch)
Check X.shape.
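
A sketch of the dictionary that `scipy.io.loadmat` returns, using a small stand-in file built in memory so it runs without the real data. The key names `data` and `labels` and the array shapes are placeholders; the actual file's keys should be read from the dictionary itself.

```python
import io

import numpy as np
import scipy.io

# Build a stand-in .mat file in memory; key names and shapes are assumptions.
buffer = io.BytesIO()
scipy.io.savemat(buffer, {
    "data": np.zeros((5, 8, 8), dtype=np.float32),  # 5 small "scans"
    "labels": np.zeros((5, 1), dtype=np.uint8),     # one label per scan
})
buffer.seek(0)

mat = scipy.io.loadmat(buffer)

# loadmat returns a dict; keys starting with '__' are file metadata.
variable_names = [key for key in mat if not key.startswith("__")]

X = mat["data"]    # image array
y = mat["labels"]  # label array, aligned with X along the first axis
```

Listing `variable_names` first avoids guessing the capitalisation of a key, and comparing `X.shape[0]` with `y.shape[0]` is a quick check that scans and labels line up.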

#

📎 unknown.png

#

@lapis sequoia could this be the dimensions

#

i suspect its a greyscale image

#

Yes. its greyscale. But I'm not sure which one is the n = Number of brainscan files.

#

most likely n=89 and the brainscan images are 176*176.

#

so i got 89 brains scans

#

pixels is 176 by 176

#

channel is one because of brain scan

#

👍

#

thank you so much

#

Now you should be able to create a classifier. Try using Transfer learning as you have only 89 images.

#

i got one more question

#

i basically have to diagnose a condition, which is a yes or no value

#

should the labels be one hot encoded?

#

so i went back and looked at the labels, which are under "Target"

#

and i got this result

#

📎 unknown.png

#

the 89 corresponding to how many images i have

#

and 1 is binary

#

would that be a correct explanation

lone osprey Oct 22, 2020, 4:49 PM

#

how do i know when user has already input a data type of every colum: int , float , string 255
@narrow flume u want to know if input is int or float or string?

lapis sequoia Oct 22, 2020, 4:49 PM

#

Most of the libraries and packages take care of this automatically.
Also, whether you should use OHE or binary (1/0) will depend on your loss function; you can use binary-cross-entropy or logloss.

#

@lapis sequoia https://stackoverflow.com/questions/50913508/what-is-the-difference-between-cross-entropy-and-log-loss-error#:~:text=Log loss and cross-entropy,resolve to the same thing.&text=Cross-entropy loss%2C or log,value between 0 and 1.

Stack Overflow

What is the difference between cross-entropy and log loss error?

What is the difference between cross-entropy and log loss error? The formulae for both seem to be very similar.

#

thank you

narrow flume Oct 22, 2020, 4:59 PM

#

can anyone help me with structure array

earnest forge Oct 22, 2020, 5:08 PM

#

lol. I was about to ask the same

#

what are you trying to do, though?

lapis sequoia Oct 22, 2020, 5:29 PM

#

Have the user input a list of columns for a table

Have the user input a data type for every column: int, float, string size 255

In a loop have the user input the values for each row

Ensure the string size doesn't exceed 255

Print the results when the user is finished

You may use Numpy or Pandas to do this
@narrow flume First get the columns names as inputs from user.
Then get columns types as inputs.
After you have this let user input the values of each row.
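
The three steps above could be sketched with a NumPy structured array as follows. The helper name `build_table`, the `TYPE_MAP` dictionary and the sample rows are hypothetical, and the values are passed in as lists so the sketch runs without prompts; in an interactive script they would come from `input()` calls.

```python
import numpy as np

# Map the type names from the exercise to NumPy dtypes; 'U255' is a
# fixed-width Unicode string of at most 255 characters.
TYPE_MAP = {"int": np.int64, "float": np.float64, "string": "U255"}


def build_table(column_names, column_types, rows):
    """Validate rows against the declared types and return a structured array."""
    dtype = np.dtype([(name, TYPE_MAP[kind])
                      for name, kind in zip(column_names, column_types)])
    converted = []
    for row in rows:
        record = []
        for value, kind in zip(row, column_types):
            if kind == "string":
                if len(value) > 255:
                    raise ValueError("string longer than 255 characters")
                record.append(value)
            elif kind == "int":
                record.append(int(value))    # raises ValueError on bad input
            else:
                record.append(float(value))  # raises ValueError on bad input
        converted.append(tuple(record))      # structured arrays expect tuples
    return np.array(converted, dtype=dtype)


table = build_table(["name", "age", "height"],
                    ["string", "int", "float"],
                    [("Ann", "31", "1.70"), ("Bob", "27", "1.82")])
```

The column types are asked for once, up front, so "done entering types" simply means one type has been collected per column name. Rejecting or re-prompting on a `ValueError` is a design choice the exercise leaves open.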

lapis sequoia Oct 22, 2020, 5:47 PM

#

@narrow flume

📎 Screenshot_2020-10-22_at_11.17.34_PM.png

#

You have to do something like that.
I'm not completing the code. You can do the validation for strings with the max allowed length (255) and check for types.

#

You can do the above with numpy also.

narrow flume Oct 22, 2020, 5:57 PM

#

oh so we have to ask user for the data type of their input every time?

#

what is the size of string 255 ? @lapis sequoia

lapis sequoia Oct 22, 2020, 5:59 PM

#

oh so we have to ask user for the data type of their input every time?
@narrow flume No we have to ask the type of column. And make user to enter that type.

#

what is the size of string 255 ? @lapis sequoia
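@narrow flume size and len are two different things in python: len() is the built-in that returns the number of characters in a string, while size is a numpy attribute/function that counts array elements.
Check the question to see whether you have to check size or len.
And you have to check whether the user is entering it correctly or not. If not, then maybe discard that input or ask the user to fill in that row again.
Check your question on what to do.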
If its not mentioned then you can decide on your own.

narrow flume Oct 22, 2020, 6:03 PM

#

then i will make a length function to check

fading dirge Oct 22, 2020, 6:03 PM

#

i would like to find the 4x4 transformation matrix that best fits one 3d set of points to another 3d set of points
can i do that with scikit learn and what functions should i start looking at to accomplish that?

#

i think i could use their stochastic gradient descent module, but is there a better way that i just dont know about?

nocturne kraken Oct 22, 2020, 6:07 PM

#

least squares

#

you're basically trying to solve a least squares problem XA = Y

#

and there's a solution to that which is just the pseudoinverse
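
A sketch of the XA = Y formulation with NumPy, assuming the points are written as rows in homogeneous coordinates so that a translation fits in the 4x4 matrix. The ground-truth matrix `true_T` and the random points are made up to have something to recover.

```python
import numpy as np

rng = np.random.default_rng(0)


def to_homogeneous(points):
    """Append a column of ones so translations fit in a 4x4 matrix."""
    return np.column_stack([points, np.ones(len(points))])


# Hypothetical ground-truth affine transform, used only to generate test data.
true_T = np.array([[0.0, -1.0, 0.0, 2.0],
                   [1.0, 0.0, 0.0, -1.0],
                   [0.0, 0.0, 1.0, 0.5],
                   [0.0, 0.0, 0.0, 1.0]])

source = rng.normal(size=(50, 3))
X = to_homogeneous(source)  # shape (50, 4)
Y = X @ true_T.T            # rows are the transformed points

# Solve X A = Y in the least-squares sense; equivalent to A = pinv(X) @ Y.
A, *_ = np.linalg.lstsq(X, Y, rcond=None)
T = A.T                     # column-vector convention: y = T @ x
```

This fits a general affine map, so the result may include scaling or shear. If the transform must be a pure rotation plus translation, a constrained method such as the Kabsch algorithm is the usual choice instead.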

fading dirge Oct 22, 2020, 6:09 PM

#

cool, so it looks like the least squares in scikit fits a line to a set of points, is there an easy way to get it to learn a matrix? sorry i'm very new to DS im basically a full stack dev who got thrown onto some ds projects hahaha

weary heart Oct 22, 2020, 6:10 PM

#

hi, if i got an r2 score of 0.96 on train data and 0.90 on test data, does it still count as overfitting?
and if so, how do i handle it? should i change the max depth and gamma? (i'm using hyperparameter tuning with xgboost)

bold olive Oct 22, 2020, 6:14 PM

#

How do you access the X_train, y_train, X_test, y_test after doing a KFold like this: cv = StratifiedKFold(n_splits=10, random_state=42, shuffle=True)?

I want to now fit the data like this:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
lda = LinearDiscriminantAnalysis()
lda.fit(X_train,y_train)
y_pred = lda.predict(X_test)
```

and get the mean scores, mean ROC, etc.

scenic hollow Oct 22, 2020, 6:16 PM

#

what does tf.compat.v1.get_default_graph mean? Like, what is a computational graph?

bold olive Oct 22, 2020, 6:19 PM

#

This I know: using the train_test_split function, you can get the splits like this:

```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

#

But how to do the same for KFold?

lapis sequoia Oct 22, 2020, 6:42 PM

#

@bold olive
Check the documentation here. https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html

You will get K different train test split. And you can train and test on each of them then aggregate your results.

bold olive Oct 22, 2020, 7:18 PM

#

So basically fit the classifier in the for loop:

```python
for train_index, test_index in cv.split(X, y):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    clf = lda.fit(X_train,y_train)
```

And then calculate mean scores and AUCs, correct?

hollow sentinel Oct 22, 2020, 7:30 PM

#

hey . guys

#

can someone explain what pandas.to_datetime means?

#

I've been seeing it pop up in a lot of Kaggle notebooks and I don't understand what it does

#

I've looked at the pandas doc too

calm forge Oct 22, 2020, 7:48 PM

#

yeah all i would do is share the documentation with you to read and go over carefully : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

hollow sentinel Oct 22, 2020, 7:56 PM

#

I think I won't figure it out until I do it in a project

#

or download a dataset and use datetime on it

heady hatch Oct 22, 2020, 8:15 PM

#

Hey guys quick question on TFRecord format.

When is it worthwhile to use?

indigo steppe Oct 22, 2020, 8:45 PM

#

if you understand the basics of python (basic functions,if statements,loops...),how hard is it to grasp ml with scikit-learn?

hard swan Oct 22, 2020, 9:00 PM

#

I need help with adaline

#

I dont understand it

#

so how does it actually work?

#

and I dont really get weight and cost in ML

lapis sequoia Oct 22, 2020, 9:42 PM

#

So basically fit the classifier in the for loop:

```python
for train_index, test_index in cv.split(X, y):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    clf = lda.fit(X_train,y_train)
```

And then calculate mean scores and AUCs, correct?

@bold olive Yes that is the idea. Just take the mean of all the metrics you want to have.
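
A fuller sketch of that loop with per-fold metrics collected and averaged. The synthetic data from `make_classification` stands in for the real `X` and `y`, which are not shown in the thread; since these are NumPy arrays, plain indexing replaces `.iloc`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data (an assumption for the example).
X, y = make_classification(n_samples=300, n_features=8, random_state=42)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
accuracies, aucs = [], []

for train_index, test_index in cv.split(X, y):
    lda = LinearDiscriminantAnalysis()  # fresh model for every fold
    lda.fit(X[train_index], y[train_index])
    y_pred = lda.predict(X[test_index])
    y_score = lda.predict_proba(X[test_index])[:, 1]  # probability of class 1
    accuracies.append(accuracy_score(y[test_index], y_pred))
    aucs.append(roc_auc_score(y[test_index], y_score))

mean_accuracy = np.mean(accuracies)
mean_auc = np.mean(aucs)
```

Creating the estimator inside the loop keeps each fold independent of the previous one. scikit-learn's `cross_val_score` wraps the same pattern when only a single score per fold is needed.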

bold olive Oct 22, 2020, 10:09 PM

#

It worked btw, @lapis sequoia. Managed to get the confusion matrix, mean accuracy and ROC curves of all folds with mean AUC.

velvet thorn Oct 22, 2020, 10:23 PM

#

can someone explain what pandas.to_datetime means?
@hollow sentinel it just converts a value or number of values to the datetime type
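
A short sketch of what that conversion looks like on a column of strings; the frame and its values are made up. After the conversion the column has a datetime dtype, which is what makes the `.dt` accessor available.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2020-10-21", "2020-10-22", "not a date"]})

# errors="coerce" turns unparseable entries into NaT instead of raising.
df["date"] = pd.to_datetime(df["date"], errors="coerce")

# Once the column has a datetime dtype, the .dt accessor exposes date parts.
df["year"] = df["date"].dt.year
df["weekday"] = df["date"].dt.day_name()
```

This is why it shows up so often in notebooks: dates read from CSV files arrive as plain strings, and filtering by month or resampling by week only works after they are converted.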

#

what don’t you get about it?

shell berry Oct 22, 2020, 10:42 PM

#

Are there any examples of using SVMs for multilabel problems, with a SVM per label perhaps?

odd yoke Oct 22, 2020, 11:00 PM

#

you can do multi label by combining any binary classification model, so yes, it's possible

#

not that it necessarily gives good results

shell berry Oct 22, 2020, 11:01 PM

#

@odd yoke Thanks - do you have any code examples or is there a pipeline in scikit learn for it?

#

I have a multiclass multilabel dataset and decision trees arent performing particularly well

#

Maybe I need more features?

odd yoke Oct 22, 2020, 11:03 PM

#

https://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputClassifier.html#sklearn.multioutput.MultiOutputClassifier this looks to be what you want in terms of code, there can be many reasons why the result is not good
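
A minimal sketch of wrapping an SVM with `MultiOutputClassifier`, assuming synthetic multilabel data from `make_multilabel_classification`; the sample sizes and the linear kernel are arbitrary choices for the example.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import MultiOutputClassifier
from sklearn.svm import SVC

# Synthetic multilabel data: Y has one binary column per label.
X, Y = make_multilabel_classification(n_samples=200, n_features=10,
                                      n_classes=3, random_state=0)

# MultiOutputClassifier fits one independent copy of the estimator per label.
clf = MultiOutputClassifier(SVC(kernel="linear"))
clf.fit(X, Y)

predictions = clf.predict(X[:5])  # shape (5, 3): one column per label
```

Because each label gets its own independent model, correlations between labels are not used, which is one reason this approach does not always give good results.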

shell berry Oct 22, 2020, 11:04 PM

#

Thanks for the link! That looks like exactly what I need

#

I was thinking I had to hardcode it all myself

#

Is it not helpful most of the time @odd yoke ?

#

Do you recommend multilabel classifiers over wrapping a SVM or something into a multioutput?

grave thunder Oct 22, 2020, 11:43 PM

#

Hey lads, I have a pandas question. What's the difference between groupby and sort_values?

patent flame Oct 22, 2020, 11:44 PM

#

group by groups based on a condition

tidal bough Oct 22, 2020, 11:44 PM

#

They return different stuff, and probably also have different algorithmic complexities (groupby requires one pass over the array, so O(n), sorting, as always, is O(n log n))

patent flame Oct 22, 2020, 11:44 PM

#

sort sorts the values based on a condition

#

for example:

#

[1, 2, 3, 4, 5, 6]

#

if u group this by > 4

#

then u get [5, 6]

#

if u sort it from large to small

#

u get

#

[6, 5, 4, 3, 2, 1]

#

@grave thunder

grave thunder Oct 22, 2020, 11:46 PM

#

And if I do groupby("Column").max(), how is that different from sorting then?

velvet thorn Oct 22, 2020, 11:46 PM

#

that's a totally different operation

grave thunder Oct 22, 2020, 11:46 PM

#

It will also return in order

velvet thorn Oct 22, 2020, 11:46 PM

#

groupby and sorting are not more than tangentially related

patent flame Oct 22, 2020, 11:47 PM

#

instead of being condescending about his question

velvet thorn Oct 22, 2020, 11:47 PM

#

the point of groupby is to split some data into groups and apply an operation to each group independently

patent flame Oct 22, 2020, 11:47 PM

#

you can just answer it, u know

velvet thorn Oct 22, 2020, 11:47 PM

#

you can just answer it, u know
@patent flame relax, I'm answering it

#

creating an example

grave thunder Oct 22, 2020, 11:47 PM

#

Ah I see, and sorting is used more like for printing data

velvet thorn Oct 22, 2020, 11:47 PM

#

not necessarily

#

sorting is used when you want to impose some kind of order on data

grave thunder Oct 22, 2020, 11:48 PM

#

And groupby when I wanna operate on a group of data

velvet thorn Oct 22, 2020, 11:49 PM

#

>>> df
    fruit  price
0   apple    1.8
1   apple    1.3
2    pear    2.3
3  banana    3.7
4    pear    2.5
5   apple    1.5
6  banana    3.4
>>> df['price'].max()
3.7
>>> df.groupby('fruit')['price'].max()
fruit
apple     1.8
banana    3.7
pear      2.5
Name: price, dtype: float64

#

a quick example

#

if you want to answer the question "which is the most expensive fruit", then you just do df['price'].max() (the max price)

#

but if you want to answer the question "for each fruit, what is the highest price", then you need to do a groupby.

grave thunder Oct 22, 2020, 11:50 PM

#

Ohhhh

velvet thorn Oct 22, 2020, 11:50 PM

#

conceptually, this splits the DataFrame into one "mini-DF" for each value of fruit, so you have one mini-DF where fruit is apple, one for banana and so on

#

then you get the .max() of each

#

then you combine them together.
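The split / apply / combine steps described above can be written out by hand, as a sketch, and compared with the built-in call. The frame is the same fruit data as in the example; the names `pieces` and `by_hand` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "fruit": ["apple", "apple", "pear", "banana", "pear", "apple", "banana"],
    "price": [1.8, 1.3, 2.3, 3.7, 2.5, 1.5, 3.4],
})

# Split: iterating a groupby yields one "mini-DF" per distinct fruit.
# Apply: take the max price inside each piece.
pieces = {fruit: mini_df["price"].max() for fruit, mini_df in df.groupby("fruit")}

# Combine: stitch the per-group results back into one Series.
by_hand = pd.Series(pieces, name="price").rename_axis("fruit")

# The one-liner performs the same three steps internally.
built_in = df.groupby("fruit")["price"].max()
print(by_hand.to_dict() == built_in.to_dict())
```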

#

does that make sense?

grave thunder Oct 22, 2020, 11:50 PM

#

yup yup ^_^

velvet thorn Oct 22, 2020, 11:50 PM

#

okay

#

so sort just orders values

#

now, you see the DF above is not ordered in any way

grave thunder Oct 22, 2020, 11:51 PM

#

Thanks lad! I've been trying to wrap my head around that for a while now

velvet thorn Oct 22, 2020, 11:51 PM

#

but I can impose an ordering:

>>> df.sort_values('price')
    fruit  price
1   apple    1.3
5   apple    1.5
0   apple    1.8
2    pear    2.3
4    pear    2.5
6  banana    3.4
3  banana    3.7

#

so now it's ordered by price in ascending order

#

you can just answer it, u know
@patent flame happy now?

grave thunder Oct 22, 2020, 11:52 PM

#

🤝

patent flame Oct 22, 2020, 11:52 PM

#

I'm not the one you should please. Be a better person for yourself not for anyone else.

grave thunder Oct 22, 2020, 11:52 PM

#

Chill, you both helped out. Thanks lads

velvet thorn Oct 22, 2020, 11:52 PM

#

I'm not the one you should please. Be a better person for yourself not for anyone else.
@patent flame that's pretty ironic because you seem rather quick to jump down someone's throat

#

Chill, you both helped out. Thanks lads
@grave thunder yw

patent flame Oct 22, 2020, 11:54 PM

#

i dont think u can have nudity in profile pic

hollow sentinel Oct 22, 2020, 11:54 PM

#

let's calm down bois we're all friends here

velvet thorn Oct 22, 2020, 11:55 PM

#

if you understand the basics of python (basic functions,if statements,loops...),how hard is it to grasp ml with scikit-learn?
@indigo steppe too early; don't do it.

#

you can follow a tutorial, and maybe something will kind of work, but way too soon you will run into problems that are above your level

#

work on your fundamentals (not just programming; mathematics too) for a while first.

grave thunder Oct 22, 2020, 11:56 PM

#

Learn about sigmoid functions for example

velvet thorn Oct 22, 2020, 11:56 PM

#

ML has become a lot more accessible in recent years, but it's still a very complex subject.

hollow sentinel Oct 22, 2020, 11:56 PM

#

@indigo steppe you can try a udemy course that I'm using: Python for Data Science and Machine Learning Bootcamp by Jose Portilla

#

not that all your doubts will be cleared but it's a good start

grave thunder Oct 22, 2020, 11:57 PM

#

Hey, I've gone through that one too! Although it's very well done, it's not for total beginners

hollow sentinel Oct 22, 2020, 11:58 PM

#

you can also try this udemy course: 2020 complete python bootcamp from zero to hero in python by Jose Portilla, Kaggle mini courses, and Andrew Ng's course

#

I haven't finished the python for DS & ML bootcamp bc of college I'm still on linear regression

grave thunder Oct 23, 2020, 12:02 AM

#

Yup, definitely can recommend Portilla. Among best 20 bucks I've spent.

#

You'll get there. ML is super fun and applicable almost everywhere. I have custom py ML programs for stocks

hollow sentinel Oct 23, 2020, 12:03 AM

#

lmao dude I can't even figure out the right dataset to use for linear regression

#

I've been looking at Kaggle datasets

grave thunder Oct 23, 2020, 12:05 AM

#

Kaggle is good imo

hollow sentinel Oct 23, 2020, 12:05 AM

#

yeah I need tabular data otherwise it's lots of value_counts

grave thunder Oct 23, 2020, 12:05 AM

#

But it depends on what you wanna use ML for. I went through the course mostly to automate my day trading, and I have constantly updating market data that comes in nicely sorted JSON or CSV files

velvet thorn Oct 23, 2020, 12:06 AM

#

lmao dude I can't even figure out the right dataset to use for linear regression
@hollow sentinel what do you mean "right"?

hollow sentinel Oct 23, 2020, 12:06 AM

#

@velvet thorn like I wouldn't know how to do linear regression on a dataset of words

velvet thorn Oct 23, 2020, 12:06 AM

#

ah, okay

#

well there are different things you can do

hollow sentinel Oct 23, 2020, 12:06 AM

#

natural language processing

velvet thorn Oct 23, 2020, 12:07 AM

#

for example, sentiment analysis

#

for text

#

that's the most common case

hollow sentinel Oct 23, 2020, 12:07 AM

#

I also need a dataset that's between 50 and 100 KB otherwise seaborn takes too long to make a graph of it

velvet thorn Oct 23, 2020, 12:08 AM

#

??

#

what kind of graph are you making

hollow sentinel Oct 23, 2020, 12:08 AM

#

distplot

#

I don't have the dataset anymore tho 😦

velvet thorn Oct 23, 2020, 12:10 AM

#

that doesn't sound right...

hollow sentinel Oct 23, 2020, 12:13 AM

#

https://www.kaggle.com/arslanali4343/real-estate-dataset

Real Estate DataSet

Dragon Real Estate - Price Predictor

#

found it

oblique socket Oct 23, 2020, 12:15 AM

#

Is there a better way to get the euclidean norm of a row using pandas and numpy?

```python
        for index, row in sums.iteritems():
            df.iloc[index] = df.iloc[index].divide(row)
        return df
```

#

preferably like a one-liner

#

well, not just get the norm, but also normalize the row by dividing by the norm.

velvet thorn Oct 23, 2020, 12:17 AM

#

huh.

#

so do you want the norm or not (stored separately)

#

or do you just want to normalise

oblique socket Oct 23, 2020, 12:17 AM

#

I just want to normalize

velvet thorn Oct 23, 2020, 12:17 AM

#

use np.linalg.norm

oblique socket Oct 23, 2020, 12:17 AM

#

I don't really care about the norm

velvet thorn Oct 23, 2020, 12:17 AM

#

although there's a better way to do that

#

sec

oblique socket Oct 23, 2020, 12:17 AM

#

does that work on a row by row basis?

lapis sequoia Oct 23, 2020, 12:18 AM

#

in pandas, I have a column with strings like "foo_2020_10_11", how can I extract that date as a datetime?

oblique socket Oct 23, 2020, 12:18 AM

#

I thought I tried that

velvet thorn Oct 23, 2020, 12:18 AM

#

df / np.linalg.norm(df, axis=1, keepdims=True)

#

@oblique socket
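A small sketch of the same row normalisation on made-up numbers, written with `DataFrame.div` so the row alignment is explicit. The frame and the names `row_norms` and `unit_rows` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [3.0, 0.0, 1.0], "y": [4.0, 2.0, 1.0]})

# One Euclidean norm per row (axis=1), computed on the underlying array.
row_norms = np.linalg.norm(df.to_numpy(), axis=1)

# div(..., axis=0) lines the norms up with the rows, so every row is
# divided by its own norm without an explicit Python loop.
unit_rows = df.div(row_norms, axis=0)

print(unit_rows)
```

After this, every row of `unit_rows` has length 1, which is easy to check with another `np.linalg.norm(..., axis=1)` call.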

#

in pandas, I have a column with strings like "foo_2020_10_11", how can I extract that date as a datetime?
@lapis sequoia use a regex

#

or rather, a regex to pull out the date part, then pd.to_datetime with a format string
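One way this could look, as a sketch: pull the date text out with `Series.str.extract`, then parse it with an explicit `format`. The column name `name` and the second sample value are made up.

```python
import pandas as pd

df = pd.DataFrame({"name": ["foo_2020_10_11", "bar_2019_01_05"]})

# Pull out the YYYY_MM_DD part with a regex capture group...
date_text = df["name"].str.extract(r"(\d{4}_\d{2}_\d{2})", expand=False)

# ...then tell to_datetime how that text is laid out.
df["date"] = pd.to_datetime(date_text, format="%Y_%m_%d")
print(df)
```

Rows that do not match the pattern come out of `str.extract` as missing values, so they end up as `NaT` rather than raising.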

lapis sequoia Oct 23, 2020, 12:19 AM

#

thanks

oblique socket Oct 23, 2020, 12:20 AM

#

Thank you!

velvet thorn Oct 23, 2020, 12:20 AM

#

yw!

oblique socket Oct 23, 2020, 12:20 AM

#

I knew there was a simpler way

velvet thorn Oct 23, 2020, 12:20 AM

#

yup

#

use of iterrows/iteritems/itertuples is very often an antipattern

oblique socket Oct 23, 2020, 12:20 AM

#

I figured, it just seemed like spaghetti

velvet thorn Oct 23, 2020, 12:20 AM

#

try to get used to broadcasting/vectorisation

#

it helps

oblique socket Oct 23, 2020, 12:23 AM

#

I have this function to normalize a dataset

#

```python
    # minmax feature scaling
    if method == 'minmax':
        # scaled value = (value - min) / (max - min)
        # should also return min and max values for future use
        # if new values are added to the dataset
        # normalize on scale [a, b] (default is [0, 1])
        normalized_df = a + (df - df.min()) * (b - a) / (df.max() - df.min())
    elif method == 'mean_normalization':
        normalized_df = (df - df.mean()) / (df.max() - df.min())
    # z-score normalization (standardization)
    elif method == 'standardize':
        # make each feature have zero mean and unit variance
        # should also return mean and std for each attribute
        # for future use in case new values are added to dataset
        # This method is widely used for normalization in many machine
        # learning algorithms (e.g., support vector machines,
        # logistic regression, and artificial neural networks).
        normalized_df = (df - df.mean()) / df.std()
    elif method == 'unit':
        # x' = x / ||x||
        # sums = df.apply(lambda x: np.sqrt(np.sum(x**2)),axis='columns')
        # for index, row in sums.iteritems():
        #     df.iloc[index] = df.iloc[index].divide(row)
        normalized_df = df / np.linalg.norm(df, axis=1, keepdims=True)
    else:
        # fail loudly instead of hitting an unbound normalized_df below
        raise ValueError(f"unknown method: {method!r}")
    return normalized_df
```

#

It's complete, for now

velvet thorn Oct 23, 2020, 12:24 AM

#

HELP

#

I'M BEING DROWNED IN COMMENTS

#

okay purely from a software engineering perspective

#

this is kind of dodgy IMO

#

I would write one function for each method of feature scaling

oblique socket Oct 23, 2020, 12:29 AM

#

yeah, I guess I could do that

#

I probably should

#

right now I'm the only one using it

velvet thorn Oct 23, 2020, 12:31 AM

#

yup, it's up to you

#

also snake case for function names is much preferred

oblique socket Oct 23, 2020, 12:31 AM

#

oops

#

I guess I'll clean that up!

#

thanks for your input!

tidal bough Oct 23, 2020, 12:37 AM

#

Alternatively, make those docstrings.

#

I'd say it's more important, since comment would take actually going to your code and reading it.
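A sketch of what the suggestions above could add up to: one snake_case function per scaling method, with the long comments turned into short docstrings. The function names and the sample frame are invented here; the formulas are the ones from the posted function.

```python
import numpy as np
import pandas as pd


def min_max_scale(df: pd.DataFrame, a: float = 0.0, b: float = 1.0) -> pd.DataFrame:
    """Rescale each column linearly onto the interval [a, b]."""
    return a + (df - df.min()) * (b - a) / (df.max() - df.min())


def mean_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Centre each column on its mean and divide by its range."""
    return (df - df.mean()) / (df.max() - df.min())


def standardize(df: pd.DataFrame) -> pd.DataFrame:
    """Give each column zero mean and unit (sample) standard deviation."""
    return (df - df.mean()) / df.std()


def unit_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Divide each row by its Euclidean norm, x' = x / ||x||."""
    return df.div(np.linalg.norm(df.to_numpy(), axis=1), axis=0)


data = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})
print(min_max_scale(data))
print(standardize(data).std())
```

With docstrings in place, `help(min_max_scale)` shows the description without anyone having to open the source file.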

oblique socket Oct 23, 2020, 12:39 AM

#

good point

shell berry Oct 23, 2020, 12:39 AM

#

📎 unknown.png

#

Seems like from the source that you can't do multi-class and multi-label together?

#

Are there any workarounds or other wrappers for this

velvet thorn Oct 23, 2020, 12:50 AM

#

Seems like from the source that you can't do multi-class and multi-label together?
@shell berry what is that from

lapis sequoia Oct 23, 2020, 12:55 AM

#

I need a novelty voice TTS engine with python..
but the only good engine I see is pyttsx3
and microsoft bob is most definitly not a novelty voice...
Im specifically trying to approximate glados
from portal
I found this: https://github.com/EtiennePerot/gladosvoicegen

but it looks terrible... and is 6 years old
and requires melodyne which wont work on my linux server
next I found this
https://github.com/kairess/tacotron

but that takes a 130 GB dataset
so... yea thats out

GitHub

EtiennePerot/gladosvoicegen

GLaDOS voice generator - Windows/Melodyne GUI automation code - EtiennePerot/gladosvoicegen

GitHub

kairess/tacotron

Portal GLaDOS voice generator. A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial) - kairess/tacotron

#

OK... alternatively... because I cant find something good...

#

what if I used pyttsx3's microsoft lucy or whatver

velvet thorn Oct 23, 2020, 12:57 AM

#

what limitations do you have

lapis sequoia Oct 23, 2020, 12:57 AM

#

and then distorted it

velvet thorn Oct 23, 2020, 12:57 AM

#

I'm assuming it has to be free?

lapis sequoia Oct 23, 2020, 12:57 AM

#

yes

#

and needs to be fast 5 seconds MAX delay from discord bot command to saying it in VC

velvet thorn Oct 23, 2020, 12:58 AM

#

hm.

#

you ask much 🥴

lapis sequoia Oct 23, 2020, 12:58 AM

#

quad core 4gb ram VPS. No gpu though

velvet thorn Oct 23, 2020, 12:59 AM

#

no GPU
❗

#

okay I guess

#

won't know until you profile it

lapis sequoia Oct 23, 2020, 12:59 AM

#

https://tenor.com/bo3zZ.gif

Tenor

#

*running on a potato

velvet thorn Oct 23, 2020, 12:59 AM

#

hm.

#

just an idea but

#

why don't you use those generic TTS services that have been around since forever

#

you know, the robotic ones

#

and then just have a transformer to make it more GLaDOS-like

lapis sequoia Oct 23, 2020, 1:00 AM

#

and fuck with it to make it distorted

velvet thorn Oct 23, 2020, 1:00 AM

#

ye

lapis sequoia Oct 23, 2020, 1:00 AM

#

yep that was fall back idea

velvet thorn Oct 23, 2020, 1:00 AM

#

I think that'd be more efficient

lapis sequoia Oct 23, 2020, 1:00 AM

#

that looks like what this does

#

http://glados.biringa.com

#

its slow because it uses a GUI tool and automates it anyway

#

but idk how to do that distortion command line

#

and their hacky solution of VM with windows and melodyne is not an option lol

#

https://glados.c-net.org

GLaDOS Voice Generator

Generate GLaDOS-like voice samples from text input

#

that one that actually works does the same thing

#

OK.. I thought GladOS would be easy... perhaps there's some other novelty tts engine I could use

#

morgan freeman, or snoop dogg, or something else... though that seems way more complex

velvet thorn Oct 23, 2020, 1:10 AM

#

OK.. I thought GladOS would be easy... perhaps there's some other novelty tts engine I could use
@lapis sequoia I'd say GLaDOS is the easiest because you can just do what we said above

#

it already sounds kinda like TTS

#

on the other hand, a real human's voice is more complex.

#

this is an unsolved problem btw

#

realistic TTS is worth a lot of $$

lapis sequoia Oct 23, 2020, 1:11 AM

#

yea thats what I was thinking

#

hum... well good news...

#

pyttsx3 is so awful it sounds close to glados already

#

NVM thats not pyttsx3 thats espeak

#

not their fault its bad I guess

#

ok other voices actually do decent, this could work

#

so... how would I add robotic distortion to an audio file?
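One classic trick for a metallic, robotic timbre is ring modulation: multiply the audio by a low-frequency sine wave. The sketch below only shows that step on a synthetic tone with NumPy; the sample rate, the 50 Hz carrier and the stand-in "voice" are all assumptions, and reading or writing real audio files is left out. Whether it sounds GLaDOS-like enough is something to judge by ear.

```python
import numpy as np

sample_rate = 22050                       # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate  # one second of timestamps

# Stand-in for decoded speech: a plain 220 Hz tone as a float array in [-1, 1].
voice = 0.8 * np.sin(2 * np.pi * 220 * t)

# Ring modulation: multiply the signal by a low-frequency sine "carrier".
carrier_hz = 50                           # assumed value; tune by ear
carrier = np.sin(2 * np.pi * carrier_hz * t)
robotic = voice * carrier

print(robotic.shape, float(np.abs(robotic).max()))
```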

lapis sequoia Oct 23, 2020, 1:43 AM

#

oh ffs. I cant even get pyttsx3 to change the fcking voice

#

or volume or anything

lone osprey Oct 23, 2020, 1:45 AM

#

U can

#

My friend did change its voice

#

I don't know what code to change

#

Check in google or docs

lapis sequoia Oct 23, 2020, 1:50 AM

#

ok gtts actually works decently

#

though Im not thrilled with the delay and external server need

shell berry Oct 23, 2020, 3:05 AM

#

📎 unknown.png

#

I keep getting this error when training, but if I set my test set to like 0.001% it goes away

#

Whenever I try a sizable test set I get the error again. I tried np.unique to make sure I had two classes and I do. Any ideas? appreciated

lapis sequoia Oct 23, 2020, 3:17 AM

#

What's the averaging statistic that weights recent values more than past ones in a time series?

hasty grail Oct 23, 2020, 3:23 AM

#

Exponential Moving Average?
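A short sketch of an exponential moving average next to a plain rolling mean, on made-up numbers. `alpha=0.5` is an arbitrary choice; larger values put more weight on the most recent point.

```python
import pandas as pd

values = pd.Series([10.0, 11.0, 12.0, 30.0, 12.0])

# Plain rolling mean: every point in the window counts equally.
rolling = values.rolling(window=3).mean()

# Exponentially weighted mean: weights decay geometrically into the past.
# adjust=False gives the recursive form ema[t] = alpha*x[t] + (1 - alpha)*ema[t-1].
ema = values.ewm(alpha=0.5, adjust=False).mean()

print(pd.DataFrame({"value": values, "rolling": rolling, "ema": ema}))
```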

shell berry Oct 23, 2020, 4:37 AM

#

I fixed that; I ran an SVC which takes like 10 mins to train and gives me ~77% accuracy, but a linear SVC takes 2-3 seconds and gives me around 97% no matter what my test split is

#

Is it really that performant or a false reading?

lapis sequoia Oct 23, 2020, 4:43 AM

#

accuracy of train, test or validation set?
Also are you performing proper splitting?
Try using grid search cross-validation with the desired hyperparameter values for both and compare the results.

shell berry Oct 23, 2020, 4:43 AM

#

Just test @lapis sequoia, I'm doing this: `x_train, x_test, y_train, y_test = train_test_split(x_counts, output_labels, test_size=0.33, random_state=100)`

dusty depot Oct 23, 2020, 4:44 AM

#

if it's giving 97% accuracy on the test set

#

that's probably okay then

#

linearsvc can converge a lot faster

shell berry Oct 23, 2020, 4:44 AM

#

Something seems off because that's really really high

dusty depot Oct 23, 2020, 4:44 AM

#

is this sklearn?

shell berry Oct 23, 2020, 4:44 AM

#

Yessir

#

I used a randomforest and got ~70%

dusty depot Oct 23, 2020, 4:45 AM

#

hmm

#

try bumping up the test size to like 50% and see what happens maybe?

#

Thonk

shell berry Oct 23, 2020, 4:46 AM

#

Just tried that

lapis sequoia Oct 23, 2020, 4:46 AM

#

linearsvc can converge a lot faster
@dusty depot it can converge lot faster but the results shouldn't be so different.

dusty depot Oct 23, 2020, 4:46 AM

#

oh, no matter what your test split is

#

hm yeah

shell berry Oct 23, 2020, 4:46 AM

#

got like 0.002% higher lol

lapis sequoia Oct 23, 2020, 4:46 AM

#

can you paste the code.

shell berry Oct 23, 2020, 4:46 AM

#

It can't be my data splitting because I''m splitting it the same way for randomforest and etc

#

Not sure if I can paste the entire code, this is for school

#

Oh oops

#

Ok lmao

lapis sequoia Oct 23, 2020, 4:48 AM

#

message me if you like.
I just need to see the splitting part and training

#

also the results.

shell berry Oct 23, 2020, 4:48 AM

#

This is really really embarrassing - I was testing on the train set. I must have changed my code and forgot to change it back
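To see how much scoring on the training rows can flatter a model, here is a sketch on synthetic data. It deliberately swaps in an unpruned decision tree instead of the LinearSVC from the conversation, because a tree memorises its training set and makes the gap obvious; the dataset parameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with label noise (flip_y), so it cannot be learned perfectly.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=100)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=100
)

# An unpruned tree memorises its training rows.
model = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)

train_acc = model.score(x_train, y_train)  # scored on rows the model has already seen
test_acc = model.score(x_test, y_test)     # scored on held-out rows: the number to report
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

A score that barely moves when the test split changes, as described above, is a hint that the test split is not actually being used for scoring.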

dusty depot Oct 23, 2020, 4:48 AM

#

oh

#

oop

lapis sequoia Oct 23, 2020, 4:48 AM

#

lol.

shell berry Oct 23, 2020, 4:48 AM

#

lol .. 😣

#

Im getting 72% now haha

dusty depot Oct 23, 2020, 4:49 AM

#

rip

shell berry Oct 23, 2020, 4:49 AM

#

Trying a normal SVC now, should get vastly different results since I'm actually doing stuff properly now

#

I've spent 90% of my time on this assignment cleaning the data

#

Are real world projects mostly like that lol

#

Ok a normal SVC gives me 56% now 😦

lapis sequoia Oct 23, 2020, 5:01 AM

#

Welcome to Data Science.
Basic cleaning is nothing.
In some projects I have spent 70% to 80% time in cleaning and cleaning only.
And I'm talking about a multiple month long project.

#

Extraction, Cleaning and Transformation will be the biggest problem in almost every project.

turbid hearth Oct 23, 2020, 5:15 AM

#

📎 unknown.png

#

Does the cross validation graph look good

#

sorry, im new to this and im getting a negative r-squared value compared to a baseline model

#

but it looks like the graph of the model i created plateaus

lapis sequoia Oct 23, 2020, 7:36 AM

#

I don't know what your CV score is, but from the loss plot I can say that you are doing something wrong, or at least not doing something correctly.

Also, baselines are used as a reference point.
Your train and test loss should be lower, and the R-squared value should be near 1. 1 is the best; 0 means the model does no better than always predicting the mean, and a negative value means it does worse than that.
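For reference, R-squared is not bounded below by 0: it is 1 for a perfect fit, 0 for a model that always predicts the mean of the targets, and negative for anything worse than that. A small NumPy sketch with made-up numbers (the helper `r_squared` is written here only for illustration):

```python
import numpy as np


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # squared error of "predict the mean"
    return 1.0 - ss_res / ss_tot


y = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r_squared(y, y)                            # every prediction exact
baseline = r_squared(y, np.full_like(y, y.mean()))   # always predict the mean
worse = r_squared(y, y[::-1])                        # predictions in reverse order

print(perfect, baseline, worse)
```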

hasty grail Oct 23, 2020, 9:04 AM

#

I feel that I am missing something because the Keras callback isn't working. Can someone point that out?

#

def get_pred_loss_dataset(test_dataset: tf.data.Dataset, model: tf.keras.Model) -> Tuple[tf.data.Dataset, tf.data.Dataset]:
    """
    Returns a dataset that yields the prediction and loss for each batch in the test dataset.

    Parameters
    ----------
    test_dataset : Dataset
        The test dataset to evaluate on. Yields `(x_true, y_true, ...)` (batched).
    model : Model
        The model that predicts on the test dataset.

    Returns
    -------
    pred_dataset : Dataset
        The resultant dataset that is suitable to be zipped with `test_dataset`.
        Yields the batch prediction.
    loss_dataset : Dataset
        The resultant dataset that is suitable to be zipped with `test_dataset`.
        Yields the batch loss.
    """
    pred_dataset = test_dataset.map(lambda x_true, *_: model(x_true))

    print(test_dataset)
    print("Obtaining loss values...")
    losses = []
    def on_batch_end(batch, logs):
        print(f"batch: {batch}, loss: {logs['loss']}")
        losses.append(logs['loss'])
    log_batch_loss = tf.keras.callbacks.LambdaCallback(on_batch_end=on_batch_end)
    results = model.evaluate(test_dataset, callbacks=[log_batch_loss])
    print(results, losses)
    loss_dataset = tf.data.Dataset.from_tensor_slices(tf.stack(losses))

    return pred_dataset, loss_dataset

#

Console output (yes I know the model sucks, that's why I am looking at where it went wrong):

<BatchDataset shapes: ((1, 512, 512, 3), (1,), (1,)), types: (tf.float16, tf.int32, tf.float16)>
Obtaining loss values...
3965/3965 [==============================] - 578s 146ms/step - loss: 4.4005 - top_1_accuracy: 0.0504 - top_3_accuracy: 0.0918 - top_5_accuracy: 0.2683
[4.40053129196167, 0.05044136196374893, 0.09180327504873276, 0.2683480381965637] []

#

As you can see the print statement in the callback isn't being called

#

I feel dumb for still not seeing the mistake after staring at the code for several minutes

#

ok I still have no idea lol

#

oh

#

I'm such an idiot

#

on_batch_end

A backwards compatibility alias for on_train_batch_end.

#

from the docs

#

so LambdaCallback is useless when not training

#

Nowhere in the docs for LambdaCallback was this mentioned

#

It's only mentioned in the base class Callback
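A sketch of one way around this: subclass `tf.keras.callbacks.Callback` and override `on_test_batch_end`, which is the hook `evaluate()` calls. The class name `TestBatchLogger`, the tiny model and the random data are all stand-ins. One caveat I am fairly but not completely sure of: in recent TensorFlow versions the `loss` in `logs` may be a running average over the batches seen so far rather than the loss of that single batch.

```python
import numpy as np
import tensorflow as tf


class TestBatchLogger(tf.keras.callbacks.Callback):
    """Record whatever Keras reports at the end of each evaluation batch."""

    def __init__(self):
        super().__init__()
        self.losses = []

    def on_test_batch_end(self, batch, logs=None):
        # Called by evaluate(); on_batch_end / on_train_batch_end fire during fit().
        self.losses.append((logs or {}).get("loss"))


# Tiny stand-in model and data, only to exercise the hook.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")
x = np.random.rand(8, 3).astype("float32")
y = np.random.rand(8, 1).astype("float32")

logger = TestBatchLogger()
model.evaluate(x, y, batch_size=2, verbose=0, callbacks=[logger])
print(len(logger.losses))
```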

rose swift Oct 23, 2020, 9:55 AM

#

hi

grave thunder Oct 23, 2020, 10:03 AM

#

Quick pandas question. Say I have DataFrame

   col1 col2
A    a1   a2
B    b1   b2
How do I check row B, column 1 if it has value b1 and if it does, drop that whole row? I tried with df.drop(df.loc[df["col1"] == "b1"]) but it doesn't work

keen sinew Oct 23, 2020, 10:19 AM

#

hi

#

can anyone help me out with this?

#

📎 Screenshot_2020-10-23_at_11.34.11_AM.png

lapis sequoia Oct 23, 2020, 1:04 PM

#

Quick pandas question. Say I have DataFrame
How do I check row B, column 1 if it has value b1 and if it does, drop that whole row? I tried with df.drop(df.loc[df["col1"] == "b1"]) but it doesn't work
@grave thunder If you know it's row B, then you can drop it directly using df.drop(index='B').
Or
index_to_drop = df[df['col1'] == "b1"].index
df.drop(index = index_to_drop )

#

also you have to pass inplace=True (or assign the result back) if you want the change to stick.

#

can anyone help me out with this?
@keen sinew Well there is no code. But it means you are missing some imports or there is a version clash which is not directly obvious. There can be other reasons too.

rain stone Oct 23, 2020, 1:30 PM

#

May I get the algo info here?

somber bane Oct 23, 2020, 1:50 PM

#

hello, I am still new to algorithms.
I am planning to build a recommendation system, maybe just a basic one.
Can anyone give me some helpful recommendations on how I should start and what method I should use, things like that?
I plan to use the feedback and ratings from other users as the data for the recommendations.

velvet thorn Oct 23, 2020, 3:28 PM

#

Quick pandas question. Say I have DataFrame
How do I check row B, column 1 if it has value b1 and if it does, drop that whole row? I tried with df.drop(df.loc[df["col1"] == "b1"]) but it doesn't work
@grave thunder df = df[df['col1'] != 'b1']
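Both approaches from this thread, sketched on the two-row frame from the question. The original attempt fails because `drop` expects index labels, not a DataFrame of the matching rows. The names `kept` and `dropped` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a1", "b1"], "col2": ["a2", "b2"]}, index=["A", "B"])

# Option 1: keep every row whose col1 is NOT "b1" (boolean mask).
kept = df[df["col1"] != "b1"]

# Option 2: look up the matching index labels and hand those to drop().
dropped = df.drop(index=df.index[df["col1"] == "b1"])

print(kept)
print(kept.equals(dropped))
```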

grave thunder Oct 23, 2020, 3:36 PM

#

@velvet thorn You save me once again

velvet thorn Oct 23, 2020, 3:37 PM

#

np

regal belfry Oct 23, 2020, 3:57 PM

#

whats a good deep learning home workstation?

lapis sequoia Oct 23, 2020, 4:03 PM

#

Anyone here use Spyder IDE? How good is it in proceding eye-pleasing visual results?

#

producing*

limpid oak Oct 23, 2020, 4:19 PM

#

if you have a Python background and want to practice Google Earth Engine, which one is more comfortable: using the Python lib in conda or JS on the GEE platform?

#

please suggest

coral trellis Oct 23, 2020, 4:38 PM

#

Hi guys, I'm wondering about something. Which libraries are most used for NLP? PyTorch or TF-Keras?

hollow sentinel Oct 23, 2020, 5:13 PM

#

@lapis sequoia Idk but I find the SpyderIDE kinda ugly. If you're doing data science I would recommend Jupyter Notebook

lapis sequoia Oct 23, 2020, 5:54 PM

#

Anyone here use Spyder IDE? How good is it in proceding eye-pleasing visual results?
@lapis sequoia Personally I'm a big fan of the RStudio IDE which used to be solely for R but has recently gained support for Python in the preview version 1.4. I'm not sure it's as feature complete as for R but it's getting there.

#

I have tried using Spyder as well but I just couldn't get used to it. The problem I have with Jupyter notebooks is that I can't readily see what variables I defined and what they look like and there is no real data browser.

real geode Oct 23, 2020, 6:19 PM

#

just use Jupyter on Visual Studio Code

#

it shows you the variables and has a data explorer

bitter harbor Oct 23, 2020, 6:44 PM

#

RStudio is great (minus the fact that it's r) but the desktop version that comes with anaconda feels like it's lacking for some reason

#

I had been using spyder for a while and it was pretty good

hollow sentinel Oct 23, 2020, 6:57 PM

#


brand_of_car = car_data.groupby('brand')['model'].count().reset_index().sort_values('model',ascending = False).head(10)
brand_of_car = brand_of_car.rename(columns = {'model':'count'})
fig = px.bar(brand_of_car, x='brand', y='count', color='count')
fig.show()

#

guys what is groupby

#

I got it from this kaggle notebook https://www.kaggle.com/tanersekmen/us-car-data-analysis-eda-visualization

US Car Data Analysis,EDA,Visualization

Explore and run machine learning code with Kaggle Notebooks | Using data from US Cars Dataset

grave frost Oct 23, 2020, 6:57 PM

#

@hollow sentinel Just google it bro

hollow sentinel Oct 23, 2020, 6:57 PM

#

i did

#

it's grouping data

#

but like what does that mean

#

how does grouping the data help

grave frost Oct 23, 2020, 6:59 PM

#

@coral trellis It depends on what you want to do. I recommend TF if you want to implement some DL paper published by Google and have a SOTA model for your task. Else just find a tutorial that covers all the theory in ML and learn that first before diving into text generation, etc.

#

@regal belfry Depends on what kind of tasks you want to do 🙂 for most people, a GTX 1050ti would work well enough (since you would be using colab for heavy tasks anyway)

#

@somber bane For what data do you want the recommendation system? What is your tentative metric for that data type?

hollow sentinel Oct 23, 2020, 7:05 PM

#

what's the difference between using df["column"] v df.column

somber bane Oct 23, 2020, 7:07 PM

#

@grave frost I will ask users to rate shows on a 1-10 scale. And then, based on the average rating along with the genre, and maybe also the user's age, recommend them shows

analog hatch Oct 23, 2020, 7:08 PM

#

@hollow sentinel groupby groups rows that share the same value in a column, that is why it's useful
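To make the notebook snippet above concrete, here is the same chain on a tiny made-up table, without the plotting step. The brands and models are invented; only the column names `brand` and `model` come from the snippet.

```python
import pandas as pd

# Tiny stand-in for the car listings; only the two columns the snippet touches.
car_data = pd.DataFrame({
    "brand": ["ford", "ford", "bmw", "ford", "bmw", "kia"],
    "model": ["f-150", "focus", "x5", "fusion", "x3", "soul"],
})

brand_counts = (
    car_data.groupby("brand")["model"].count()   # number of rows per brand
    .reset_index()                               # turn brand back into a normal column
    .rename(columns={"model": "count"})
    .sort_values("count", ascending=False)       # most common brand first
    .head(10)
)
print(brand_counts)
```

So grouping is what turns "one row per listing" into "one row per brand", which is the shape a bar chart of brand counts needs.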

#

and df["column"] only works if u have a column named "column"

grave frost Oct 23, 2020, 7:08 PM

#

@somber bane Hmm.. seems workable. How much accuracy should it have? Like is it for personal use or you want to use it in a real world scenario?

analog hatch Oct 23, 2020, 7:09 PM

#

df.columns shows your columns in the dataframe
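On the `df["column"]` versus `df.column` question, a short sketch with invented column names: for a simple name both return the same Series, but bracket access is the one that always works.

```python
import pandas as pd

df = pd.DataFrame({"price": [1.5, 2.0], "unit price": [0.5, 1.0], "count": [3, 2]})

# For a simple name, both spellings return the same Series.
same = df["price"].equals(df.price)

# Bracket access works for any label, including ones with spaces.
unit = df["unit price"]

# Attribute access finds DataFrame methods first: df.count is the method,
# not the "count" column.
count_is_method = callable(df.count)
count_column = df["count"]

# New columns have to be created with brackets.
df["total"] = df["price"] * df["count"]
print(same, count_is_method, df.columns.tolist())
```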

somber bane Oct 23, 2020, 7:09 PM

#

I am building this for my freshman computer science project

#

but I plan to publish it for public use

#

so maybe as accurate as possible, but it's not a requirement

grave frost Oct 23, 2020, 7:10 PM

#

If the project is all about a recommendation system, I recommend you use some industry-level algo instead of implementing your own. However, if your teacher expects a custom system, then that's a different story..

heady hatch Oct 23, 2020, 7:11 PM

#

You can write a basic one, given that you know how to use numpy and linear algebra.

somber bane Oct 23, 2020, 7:11 PM

#

@grave frost so do you have any recommend industry level ones

#

I just learned to use numpy and pandas, so where should I start. I mean I need some help in setting up a basic picture and working frame behind the algo

grave frost Oct 23, 2020, 7:13 PM

#

If you are sure that the project's goal is not to make your own custom algo, then industry ones are obv good enough. What does your data look like?

somber bane Oct 23, 2020, 7:13 PM

#

right now I have not started collecting data from users yet

grave frost Oct 23, 2020, 7:13 PM

#

If it was me, I would be developing some method based on simple ML techniques

somber bane Oct 23, 2020, 7:14 PM

#

what does ML stand for?

grave frost Oct 23, 2020, 7:14 PM

#

But it may require a good amount of coding and theory

#

ML- Machine Learning

somber bane Oct 23, 2020, 7:15 PM

#

oh,

grave frost Oct 23, 2020, 7:15 PM

#

But don't worry, it would just be a bit of maths

#

not actual ML

somber bane Oct 23, 2020, 7:15 PM

#

I recently look up at this website

#

https://www.analyticsvidhya.com/blog/2018/06/comprehensive-guide-recommendation-engine-python/?#

Analytics Vidhya

Pulkit Sharma

Comprehensive Guide to build Recommendation Engine from scratch

This is a comprehensive guide to building recommendation engines from scratch in Python. Learn to build a recommendation engine using matrix factorization.

#

I think I can understand most of it, but I do believe my teacher hopes I'll build one myself, not just copy and paste

heady hatch Oct 23, 2020, 7:16 PM

#

To clarify when you say your teacher hopes you build one yourself, do they mean implement a library or algo vs using some library?

somber bane Oct 23, 2020, 7:17 PM

#

algo

heady hatch Oct 23, 2020, 7:17 PM

#

Like is it okay to

import recommendation_engine
recommendation_engine.fit()
recommendation_engine.predict()

violet veldt Oct 23, 2020, 7:17 PM

#

guys, i want to split x axis labels into years, and labelless ticks between them, how can i do that?

📎 unknown.png

somber bane Oct 23, 2020, 7:18 PM

#

I think he will be okay with it, since I am only a freshman

heady hatch Oct 23, 2020, 7:19 PM

#

Okay well, @grave frost might have more libraries. But a couple that come to mind are Surprise and ALS from Spark.

somber bane Oct 23, 2020, 7:20 PM

#

so do I go ahead and study how to use those libraries, and implement them?

heady hatch Oct 23, 2020, 7:20 PM

#

But recommendation problems are a bit tricky to begin with, since they require you to have an understanding of what the algorithm is doing.

But like previously stated, it only requires a bit of understanding.

#

Implementing them from scratch might be a bit rough.

#

But I think it would be useful to look at source code to see how they're implemented.

#

If you want something easy to digest and start with, you can grab your data and then recommend items most popular by ratings.

somber bane Oct 23, 2020, 7:22 PM

#

so can you recommend one library that is friendly to beginner

heady hatch Oct 23, 2020, 7:23 PM

#

I think Surprise is pretty friendly.

#

I think it's the whole problem that requires a bit of understanding.

#

Once you understand the problem, the library is just tools to help solve it.

somber bane Oct 23, 2020, 7:24 PM

#

I will work hard on the understanding part. Could you help me find some sources that you think would be helpful for getting started with Surprise? Thanks

heady hatch Oct 23, 2020, 7:24 PM

#

I think the link you used talked about it.

#

https://developers.google.com/machine-learning/recommendation

Google Developers

Introduction | Recommendation Systems | Google Developers

#

Here's one from google.

somber bane Oct 23, 2020, 7:25 PM

#

Thank you very much! @heady hatch

heady hatch Oct 23, 2020, 7:26 PM

#

But if you built a basic one like just recommending stuff based on popularity, you won't really need much of anything other than maybe data science.

ie
-> group by some category
-> grab top 10 items in those categories
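
The two steps above can be sketched with pandas. This is only an illustration under assumptions: the `ratings` table and its column names (`category`, `item`, `rating`) are made up, and "popular" is taken to mean highest average rating.

```python
import pandas as pd

# Made-up ratings table; the column names are assumptions for illustration.
ratings = pd.DataFrame({
    "category": ["book", "book", "book", "film", "film", "film"],
    "item": ["A", "A", "B", "C", "D", "D"],
    "rating": [5, 4, 3, 2, 5, 4],
})

# Average rating and number of ratings per item within each category.
stats = (
    ratings.groupby(["category", "item"])["rating"]
    .agg(mean_rating="mean", n_ratings="count")
    .reset_index()
)

# Keep the top N items per category by average rating.
top_n = 10
top_items = (
    stats.sort_values(["category", "mean_rating"], ascending=[True, False])
    .groupby("category")
    .head(top_n)
)
print(top_items)
```

In practice it may be worth filtering on `n_ratings` as well, so that an item with a single 5-star rating does not outrank everything else.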

somber bane Oct 23, 2020, 7:27 PM

#

no, I asked my professor for something challenging, because I have programming experience. So his purpose is to keep me from being bored in class

heady hatch Oct 23, 2020, 7:27 PM

#

Ahh okay then, check out those resources and have fun!

somber bane Oct 23, 2020, 7:28 PM

#

😀 I might come back with more questions @heady hatch

#

Thanks a lot

hollow sentinel Oct 23, 2020, 7:32 PM

#

📎 unknown.png

#

to think I wanted to make a linear regression out of this

heady hatch Oct 23, 2020, 7:33 PM

#

Looks like there could be some kind of relationship there! Maybe under log transformation?

#

Or maybe no relationships at all.
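
For reference, a log transformation just means replacing a variable with its logarithm before looking for a straight-line relationship. A minimal sketch with made-up exponential data (not the dataset in the screenshot):

```python
import numpy as np

# Made-up data with an exponential relationship: y = 3 * exp(0.5 * x).
x = np.linspace(0, 10, 50)
y = 3 * np.exp(0.5 * x)

# Pearson correlation on the raw scale vs. after taking log(y).
raw_corr = np.corrcoef(x, y)[0, 1]
log_corr = np.corrcoef(x, np.log(y))[0, 1]

print(f"raw: {raw_corr:.3f}, log-transformed: {log_corr:.3f}")
```

Here log(y) is exactly linear in x, so the transformed correlation comes out at essentially 1, while the raw one is lower.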

hollow sentinel Oct 23, 2020, 7:34 PM

#

dk what that is

heady hatch Oct 23, 2020, 7:34 PM

#

Me neither.

hollow sentinel Oct 23, 2020, 7:34 PM

#

but def not linear regression

#

I'm gonna do it anyways for the learning experience

heady hatch Oct 23, 2020, 7:34 PM

#

Do it!

#

btw daspecito, I noticed you're unfamiliar with Python. It's great that you're eager to do data science but I think it's important to be familiar with Python first.

hollow sentinel Oct 23, 2020, 7:35 PM

#

I took my college CS course when I was a cs major

heady hatch Oct 23, 2020, 7:35 PM

#

Once you get familiar with Python more, do a bit of data science, check out other's notebooks, and then back and forth.

#

Okay, that's good.

hollow sentinel Oct 23, 2020, 7:36 PM

#

I'm just rusty lol

heady hatch Oct 23, 2020, 7:36 PM

#

But it doesn't seem to help your unfamiliarity with Python.

hollow sentinel Oct 23, 2020, 7:36 PM

#

where am I unfamiliar

#

a lot of places

#

hahhahha

heady hatch Oct 23, 2020, 7:36 PM

#

hahaha

#

I'm not saying you're terrible at programming.

hollow sentinel Oct 23, 2020, 7:36 PM

#

I don't know OOP

heady hatch Oct 23, 2020, 7:36 PM

#

Just should be more familiar with Python syntax.

hollow sentinel Oct 23, 2020, 7:37 PM

#

that is something I should learn

heady hatch Oct 23, 2020, 7:37 PM

#

Because when you jump into data science, you don't want to be dealing with both Python and Data Science. Since both topics are quite wide.

#

You just want to focus on getting the information you want rather than dealing with syntax troubles.

hollow sentinel Oct 23, 2020, 7:38 PM

#

oh like that one time I got confused and kept writing state

heady hatch Oct 23, 2020, 7:38 PM

#

Ye.

hollow sentinel Oct 23, 2020, 7:38 PM

#

yeah that was dumb

#

Well I was newer to pandas then

#

I make a lot of mistakes starting out

#

I'll get better

heady hatch Oct 23, 2020, 7:40 PM

#

💪 I know you will.

hollow sentinel Oct 23, 2020, 7:41 PM

#

so do you most commonly use a jointplot to see if there's a relationship?

#

between two variables?

#

jointplot in seaborn I mean

heady hatch Oct 23, 2020, 7:43 PM

#

Yea, jointplot works. I think I also use pairplot.

#

https://seaborn.pydata.org/generated/seaborn.pairplot.html

hollow sentinel Oct 23, 2020, 7:44 PM

#

yeah my pairplot isn't very promising either

#

not very surprised

#

df.drop(['Unnamed: 0','vin'],axis=1,inplace=True)

#

what does axis = 1 control

heady hatch Oct 23, 2020, 7:46 PM

#

Second axis.

hollow sentinel Oct 23, 2020, 7:47 PM

#

but what does second axis mean

heady hatch Oct 23, 2020, 7:47 PM

#

if you have row and col, it's col.

hollow sentinel Oct 23, 2020, 7:47 PM

#

oh

heady hatch Oct 23, 2020, 7:47 PM

#

It's referring to dim.

hollow sentinel Oct 23, 2020, 7:47 PM

#

hahhhahahha linear alg also don't know that

heady hatch Oct 23, 2020, 7:48 PM

#

Because matrix can have more than 2 dim.
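
A small sketch of what the axis argument selects in `DataFrame.drop`, using a made-up frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# axis=0 refers to rows: drop the row labelled 0.
without_row = df.drop(0, axis=0)

# axis=1 refers to columns: drop the column labelled "b".
without_col = df.drop("b", axis=1)

print(without_row.shape)  # (1, 3)
print(without_col.shape)  # (2, 2)
```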

hollow sentinel Oct 23, 2020, 7:48 PM

#

I may need to take a linear algebra course

heady hatch Oct 23, 2020, 7:48 PM

#

It would help.

#

Or if you're going to be working mainly with 2 dimensional data, you could look into data science courses on MOOC.

bitter harbor Oct 23, 2020, 7:49 PM

#

3b1b has a pretty good series on la

#

that's where I learnt it from

#

the basics of how nn's work

#

calc if you need it

#

really if I need anything math related I go to him first

#

super nice guy too

hollow sentinel Oct 23, 2020, 7:51 PM

#

very cool @bitter harbor I'll look at his stuff

#

thanks

bitter harbor Oct 23, 2020, 7:52 PM

#

i'd suggest taking notes, la gets pretty heavy pretty quick

#

but the comments are full of people complaining that he taught the subject better in a video than their profs in a semester

heady hatch Oct 23, 2020, 7:53 PM

#

I wonder why they don't use comment to talk about LA stuff, I think it would be a better use of their time.

bitter harbor Oct 23, 2020, 7:54 PM

#

it's yt?

heady hatch Oct 23, 2020, 7:54 PM

#

hahaha

hollow sentinel Oct 23, 2020, 8:03 PM

#

carsData.drop([labels = "vin","lot"],axis=1)

#

I'm trying to drop columns "vin" and "lot"

#

and it says I have a syntax error unsurprisingly
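
The syntax error comes from writing the keyword `labels =` inside the list brackets. A minimal sketch of two working spellings, assuming `carsData` is a DataFrame that has "vin" and "lot" columns (the rows below are made up):

```python
import pandas as pd

# Made-up stand-in for carsData.
carsData = pd.DataFrame({
    "price": [6300, 2899],
    "vin": ["vin-0001", "vin-0002"],
    "lot": [101, 102],
})

# The keyword argument goes outside the list: labels=[...], not [labels = ...].
dropped = carsData.drop(labels=["vin", "lot"], axis=1)

# Equivalent spelling that avoids the axis argument altogether.
dropped_too = carsData.drop(columns=["vin", "lot"])

print(list(dropped.columns))
```

Neither call modifies `carsData` itself; assign the result back (or pass `inplace=True`) to keep the change.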

analog hatch Oct 23, 2020, 8:06 PM

#

if u making linear regression on data u need to find a closer correlation between certain columns

#

try using corr()

#

is a great way to find what is the best one
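
A minimal sketch of using `DataFrame.corr()` to see which columns move together with a target. The `cars` frame and its columns are made up for illustration:

```python
import pandas as pd

# Made-up numeric data; column names are for illustration only.
cars = pd.DataFrame({
    "price": [30000, 25000, 18000, 12000, 8000],
    "mileage": [10000, 30000, 60000, 90000, 140000],
    "year": [2019, 2018, 2016, 2013, 2010],
})

# Pairwise Pearson correlations between the numeric columns.
corr_matrix = cars.corr()

# Correlation of every other column with price, strongest first by absolute value.
price_corr = corr_matrix["price"].drop("price")
order = price_corr.abs().sort_values(ascending=False).index
print(price_corr.reindex(order))
```

Note that this only measures linear association, so a weak value does not rule out a non-linear relationship.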

hollow sentinel Oct 23, 2020, 8:07 PM

#

thank you @analog hatch

analog hatch Oct 23, 2020, 8:07 PM

#

btw each factor of a data is important if u puting it in as the X_train data

#

dont exclude anything other than columns with strings

hollow sentinel Oct 23, 2020, 8:08 PM

#

so don't take out the lot or the vin number

#

some kaggle notebooks did that so I was wondering if I should do it too

analog hatch Oct 23, 2020, 8:09 PM

#

I mean if u are doing a tutorial, might as well. But data is important even if there is barely any correlation. Of course, correlation would always be the main factor for the outcome.

hollow sentinel Oct 23, 2020, 8:10 PM

#

makes sense

#

yeah I'm just looking how they decide to visualize the data to make it look good

analog hatch Oct 23, 2020, 8:11 PM

#

I mean u are doing it good with seaborn and matplotlib

hollow sentinel Oct 23, 2020, 8:11 PM

#

yeah

analog hatch Oct 23, 2020, 8:11 PM

#

u can try plotly, it's more 3 dimensional

#

if u want to use it

hollow sentinel Oct 23, 2020, 8:11 PM

#

pairplot, distplot, lmplot does the job pretty well

analog hatch Oct 23, 2020, 8:11 PM

#

yeah pairplot works similarly to .corr()

hollow sentinel Oct 23, 2020, 8:12 PM

#

i think i've seen .corr() before

analog hatch Oct 23, 2020, 8:12 PM

#

I am not an expert at this but i do like ML and DL

#

XD I enjoy that shit

hollow sentinel Oct 23, 2020, 8:12 PM

#

hahaahhahahahh I am nowhere near an expert

#

I just started DS & ML like 2 weeks ago

analog hatch Oct 23, 2020, 8:12 PM

#

yeah it must be hard

#

dammm thats good thou

hollow sentinel Oct 23, 2020, 8:13 PM

#

yeah I've been doing a udemy course

analog hatch Oct 23, 2020, 8:13 PM

#

yoooo me too

#

udemy is the best

#

I been doing on and off for a year

hollow sentinel Oct 23, 2020, 8:13 PM

#

python for data science and machine learning bootcamp?

analog hatch Oct 23, 2020, 8:13 PM

#

yess great course

#

deep learning courses also

hollow sentinel Oct 23, 2020, 8:13 PM

#

love it jose Portilla is amazing

analog hatch Oct 23, 2020, 8:14 PM

#

yeahh he is really good at explaining the basic of it

hollow sentinel Oct 23, 2020, 8:14 PM

#

yeah the way he gives you answers with documented code is great

#

so you can follow along

#

I always have his stuff open when I do my own work bc it's a good guide

analog hatch Oct 23, 2020, 8:15 PM

#

True, once u finish doing data analysis the machine learning part gets better

#

the deep learning is my favorite one

#

is like 5 hours of course

#

but its great

hollow sentinel Oct 23, 2020, 8:15 PM

#

I will try it

analog hatch Oct 23, 2020, 8:15 PM

#

yeahh dude do it its worth it

#

you can do so much with ML

hollow sentinel Oct 23, 2020, 8:17 PM

#

yeah my friend keeps telling me to switch to hacking and i'm like

#

📎 2Q.png

analog hatch Oct 23, 2020, 8:17 PM

#

True XD

#

ML is more modern

hollow sentinel Oct 23, 2020, 8:17 PM

#

only after I completely master machine learning and make bank

analog hatch Oct 23, 2020, 8:18 PM

#

you can be more creative

hollow sentinel Oct 23, 2020, 8:18 PM

#

I just find it cool

analog hatch Oct 23, 2020, 8:18 PM

#

yeah me too

hollow sentinel Oct 23, 2020, 8:18 PM

#

I want to create algorithms that help predict cancer

#

that's what I find cool

analog hatch Oct 23, 2020, 8:19 PM

#

actually with the basic u can make one easily with enough data

hollow sentinel Oct 23, 2020, 8:19 PM

#

yeah but like an insanely accurate one

#

you don't even need to create an algorithm for that tbh

#

it's a classification problem

analog hatch Oct 23, 2020, 8:20 PM

#

that takes time of course, cleaning and modifying to make it almost perfect. in the course you would learn how u are supposed to control your data so the model does not overfit or underfit

hollow sentinel Oct 23, 2020, 8:20 PM

#

yep

analog hatch Oct 23, 2020, 8:20 PM

#

oo yeah thats logistic regression

hollow sentinel Oct 23, 2020, 8:20 PM

#

haven't learned that yet

#

figured I'd do some linear regression on my own and then hop back into the course

analog hatch Oct 23, 2020, 8:21 PM

#

cool cool i mean your in the right path

#

take your time and enjoy what u doing

hollow sentinel Oct 23, 2020, 8:22 PM

#

definitely

#

I improve every day

analog hatch Oct 23, 2020, 8:22 PM

#

cool cool you got any question I would be gladly to help

#

u can just dm it to me

hollow sentinel Oct 23, 2020, 8:23 PM

#

great man I really appreciate it

#

all of you guys have been very supportive

heady tide Oct 23, 2020, 8:25 PM

#

Displaying the tf-idf vector of a pdf file using Pyqt

📎 unknown.png

#

it's so cool that this method actually knows when a word is overused and lowers its weight accordingly
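
The down-weighting comes from the inverse document frequency (idf) term. A minimal sketch with scikit-learn's `TfidfVectorizer` on a made-up corpus (not the PDF from the screenshot):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up mini corpus: "data" appears in every document, "cancer" in one.
docs = [
    "data science with python data",
    "data cleaning and data visualization",
    "predict cancer from data",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)

# idf_ holds one inverse-document-frequency weight per vocabulary term.
idf = {term: vectorizer.idf_[i] for term, i in vectorizer.vocabulary_.items()}
print(idf["data"], idf["cancer"])
```

A word that shows up in every document gets the smallest idf, so its tf-idf weight is pulled down relative to rarer words.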

hollow sentinel Oct 23, 2020, 8:35 PM

#

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

#

ValueError: With n_samples=1, test_size=0.3 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.

#

uhhhhhh

#

X = [["Price", 'lot', 'year' ]]
#What you're using to predict the mileage
y = ["Mileage"]
#Trying to predict the mileage based on the price

#

that's what X and y are equal to

#

I got no clue

#

omg guys I think I figured out my first machine learning error

errant parcel Oct 23, 2020, 9:03 PM

#

pyqtgraph is datascience right!

#

📎 unknown.png

#

what the fresh hell is going on here

#

i assumed it was some 'first argument is self' business

#

but

#

wot

hollow sentinel Oct 23, 2020, 9:05 PM

#

never heard of pyqtgraph @errant parcel but I'm new to DS/ML

errant parcel Oct 23, 2020, 9:05 PM

#

noice

#

what was your error in the end

#

and it turns out the issue with that is just that they accidentally made something optional that shouldn't have been

hollow sentinel Oct 23, 2020, 9:06 PM

#

I didn't have the dataframe I was getting the column out of

errant parcel Oct 23, 2020, 9:06 PM

#

so it's possible to pass no arguments and it needs arguments

#

oh

hollow sentinel Oct 23, 2020, 9:06 PM

#

so I was just putting in a list

#

lmao

#

that's the good type of error
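
To spell out the fix: a bare list of column names is a single sample as far as scikit-learn is concerned, hence `n_samples=1`. A minimal sketch, assuming the data lives in a DataFrame called `df` (the name and the rows are made up) with the columns used above:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up stand-in for the cars dataset.
df = pd.DataFrame({
    "Price": [6300, 2899, 5350, 25000, 27700, 5700, 7300, 13350, 14600, 5250],
    "lot": list(range(10)),
    "year": [2008, 2011, 2018, 2014, 2018, 2018, 2010, 2017, 2018, 2017],
    "Mileage": [274117, 190552, 39590, 64146, 6654, 45561, 149050, 23525, 9371, 63418],
})

# Select the columns from the DataFrame instead of listing their names.
X = df[["Price", "lot", "year"]]
y = df["Mileage"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)
print(X_train.shape, X_test.shape)
```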

quick epoch Oct 23, 2020, 10:14 PM

#

Hi guys, does anyone know Matplotlib and is willing to help me out XD?

austere swift Oct 23, 2020, 10:20 PM

#

just ask your question

hollow sentinel Oct 23, 2020, 10:28 PM

#

^

quick epoch Oct 23, 2020, 10:31 PM

#

Sorry, so, I want to produce a histogram that show how the different traffic levels impact the light changes. I got all the data. I kinda know what to do but I am struggling with making a multicolour curve and histogram

velvet thorn Oct 23, 2020, 11:27 PM

#

Sorry, so, I want to produce a histogram that show how the different traffic levels impact the light changes. I got all the data. I kinda know what to do but I am struggling with making a multicolour curve and histogram
@quick epoch what do you mean multicolour?

#

got an example?

#

so it's possible to pass no arguments and it needs arguments
@errant parcel my guess is that it wraps a C library so it can't do argument checking on the Python side...?

#

but honestly the message looks p self-explanatory

errant parcel Oct 23, 2020, 11:29 PM

#

well I'm not passing None

#

i'm passing nothing

#

my guess is that all arguments are optional, but when it calls addHandle it fails to actually provide valid default values

velvet thorn Oct 23, 2020, 11:31 PM

#

my guess is that all arguments are optional, but when it calls addHandle it fails to actually provide valid default values
@errant parcel don't have enough experience to say, but one way of implementing overloads is to have None as default arguments

#

🤷‍♂️

errant parcel Oct 23, 2020, 11:32 PM

#

yep but i think the combinations that it allows are wrong

velvet thorn Oct 23, 2020, 11:32 PM

#

yeah, that's possible

#

just throwing in my utterly uninformed two cents

#

it's a classification problem
@hollow sentinel it being a classification problem doesn't mean a new algorithm/architecture wouldn't be appropriate/necessary

hollow sentinel Oct 23, 2020, 11:36 PM

#

true @velvet thorn

quick epoch Oct 23, 2020, 11:37 PM

#

@velvet thorn

📎 image0.png

#

Something like this

#

But I just want a single line

velvet thorn Oct 23, 2020, 11:40 PM

#

But I just want a single line
@quick epoch that is a single line

#

or do you mean a single plot?

#

anyway, I believe that's from an official MPL example, right?

#

so you should just be able to follow it

hollow sentinel Oct 24, 2020, 12:21 AM

#

so there's no graphs you create after a logistic regression

#

after you create the model you just do the classification report and that's how you can judge how the model did

#

right?

hollow sentinel Oct 24, 2020, 12:40 AM

#

also has anyone's tab shift tab to see jupyter doc stop working?

#

mine doesn't work at times and I don't get why

velvet thorn Oct 24, 2020, 12:41 AM

#

after you create the model you just do the classification report and that's how you can judge how the model did
@hollow sentinel that's a start

#

but there are many other things you can do

#

look into lift

#

calibration

#

ROC-AUC score

#

PR curve
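
A minimal sketch of two of the items above (ROC-AUC and the precision-recall curve) with scikit-learn. The data is synthetic and the model choice is only for illustration; for calibration, `sklearn.calibration.calibration_curve` is a possible starting point.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# These metrics use predicted probabilities rather than hard class labels.
proba = model.predict_proba(X_test)[:, 1]

auc = roc_auc_score(y_test, proba)
precision, recall, thresholds = precision_recall_curve(y_test, proba)

print(f"ROC-AUC: {auc:.3f}")
```

The `precision` and `recall` arrays can be passed straight to `plt.plot(recall, precision)` to draw the PR curve.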

hollow sentinel Oct 24, 2020, 12:42 AM

#

oh my udemy course didn't mention those haha probably bc it's introductory

#

idk why my tab shift isn't working

#

📎 unknown.png

#

when you have X why do we use a list inside of a list

velvet thorn Oct 24, 2020, 12:46 AM

#

when you have X why do we use a list inside of a list
@hollow sentinel because X must be 2D

hollow sentinel Oct 24, 2020, 12:46 AM

#

bc train_test_split requires it?

velvet thorn Oct 24, 2020, 12:46 AM

#

no

#

because otherwise

#

how would you tell the difference between N samples with 1 feature and 1 sample with N features?
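
A small sketch of that ambiguity with NumPy: the same four numbers can be read either way, and only the 2D shape says which.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Four samples with one feature each: shape (n_samples, n_features) = (4, 1).
many_samples = values.reshape(-1, 1)

# One sample with four features: shape (1, 4).
one_sample = values.reshape(1, -1)

print(many_samples.shape, one_sample.shape)  # (4, 1) (1, 4)
```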

hollow sentinel Oct 24, 2020, 12:47 AM

#

uhhhhhh

#

feature?

velvet thorn Oct 24, 2020, 12:48 AM

#

yes

hollow sentinel Oct 24, 2020, 12:48 AM

#

oh ok i see

#

yeah there'd be no other way

#

thanks @velvet thorn that was a question that was bothering me haha

velvet thorn Oct 24, 2020, 12:48 AM

#

yw

quick epoch Oct 24, 2020, 9:17 AM

#

Yeah I tried but I did if statements to change the colours whenever a certain value appears. I will show you what I mean in a sec

lapis sequoia Oct 24, 2020, 9:17 AM

#

Could anyone please be so kind to explain what I am doing wrong with my plot for my models?

📎 Screenshot_2020-10-24_at_11.16.22.png

paper niche Oct 24, 2020, 9:19 AM

#

you have 5 bars but are trying to set 7 tick labels? @lapis sequoia

lapis sequoia Oct 24, 2020, 9:21 AM

#

you have 5 bars but are trying to set 7 tick labels? @lapis sequoia
@paper niche How do I change this?

paper niche Oct 24, 2020, 9:22 AM

#

your names list has 7 items in it

#

reduce it to 5 items. I see you have Logistic Regression and Decision Tree repeated twice. I suppose that isn't intentional?

lapis sequoia Oct 24, 2020, 9:23 AM

#

Oops

#

your names list has 7 items in it
@paper niche I've been staring myself blind on this, thanks!

paper niche Oct 24, 2020, 9:24 AM

#

yep, no sweat

lapis sequoia Oct 24, 2020, 9:24 AM

#

yep, no sweat
@paper niche Is there a way to get the percentage shown in each bar?

#

For example like this

📎 Screenshot_2020-10-24_at_11.25.06.png

paper niche Oct 24, 2020, 9:26 AM

#

Something like this: https://stackoverflow.com/a/28931750

Stack Overflow

Adding value labels on a matplotlib bar chart

I got stuck on something that feels like should be relatively easy. The code I bring below is a sample based on a larger project I'm working on. I saw no reason to post all the details, so please a...
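
A minimal sketch of the idea, writing each bar's value as a percentage with `ax.text`. The model names and accuracies are made up; note that the number of tick labels matches the number of bars.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up model names and accuracies; one name per bar.
names = ["Logistic Regression", "Decision Tree", "Random Forest", "KNN", "SVM"]
accuracies = [0.84, 0.79, 0.88, 0.81, 0.86]

fig, ax = plt.subplots()
bars = ax.bar(range(len(names)), accuracies)

# The number of tick positions must match the number of labels.
ax.set_xticks(range(len(names)))
ax.set_xticklabels(names, rotation=45, ha="right")

# Write each bar's value as a percentage just above the bar.
for bar, acc in zip(bars, accuracies):
    ax.text(
        bar.get_x() + bar.get_width() / 2,
        bar.get_height(),
        f"{acc:.0%}",
        ha="center",
        va="bottom",
    )
```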

hazy field Oct 24, 2020, 11:31 AM

#

hey,
i tried vgg2 face for face verification. and i was wondering, can we detect that the face is, in fact, a real human face, not some cut-out face print on cardboard ?
is there any research on this or any library that i can use?

#

hmm, i think i got the keyword, liveness detection!

sour cradle Oct 24, 2020, 1:50 PM

#

I'm trying to convert my data into a format for ml. It gives an example where it uses audio from util, but I can't find anything about that online. Is there a drop-in replacement for that library?

#

never mind, it was a custom library

hollow sentinel Oct 24, 2020, 3:29 PM

#

so I was following this kaggle notebook https://www.kaggle.com/rtatman/data-cleaning-challenge-handling-missing-values

Data Cleaning Challenge: Handling missing values

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

#

and I did this

#

total_cells = np.prod(nfl_data.shape)
total_missing = missing_values_count.sum()

# percent of data that is missing
(total_missing/total_cells) * 100

#

on my dataset and I got that my data set is missing 95% of its data

#

did I do something wrong? is it possible to have a dataset that's missing 95% of its data?
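
A very high share is possible, for example when a few columns are almost entirely empty. It can also be inflated if `missing_values_count` and the shape come from different DataFrames, which is worth checking after copying notebook code. A minimal sketch of a per-column breakdown on a made-up frame:

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for the real dataset.
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, np.nan, np.nan],
    "c": ["x", "y", None, "z"],
})

missing_values_count = data.isnull().sum()

# Overall share of missing cells, computed from the same DataFrame throughout.
total_cells = np.prod(data.shape)
total_missing = missing_values_count.sum()
percent_missing = total_missing / total_cells * 100

# Per-column share, which shows whether a few columns dominate the total.
percent_by_column = data.isnull().mean() * 100

print(round(percent_missing, 1))
print(percent_by_column)
```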

lapis sequoia Oct 24, 2020, 3:41 PM

#

I'm wondering why parameter tuning with gridsearchCV is giving me worse accuracy than the default?

livid tundra Oct 24, 2020, 4:12 PM

#

I hope it's not a dumb question; where can one find interesting csv files to practice visualization/analysis as a beginner?

hollow sentinel Oct 24, 2020, 4:25 PM

#

@livid tundra not at all. kaggle is great for that. There are notebooks that will teach you data cleaning, data visualization, and machine learning.

livid tundra Oct 24, 2020, 4:54 PM

#

Thank you very much!

hollow sentinel Oct 24, 2020, 4:55 PM

#

@livid tundra no problem

#

why is there no accuracy shown in the classification report

📎 unknown.png

shell berry Oct 24, 2020, 4:57 PM

#

How do you fit multiple features into a model? Would it be like [[f1, f2], [f1,f2]] etc. where [f1, f2] is one training example with two features?

tidal bough Oct 24, 2020, 5:06 PM

#

each example would be a vector, yes

#

usually the input is a matrix of shape (n_samples,n_features)

shell berry Oct 24, 2020, 5:06 PM

#

Thank you

tidal bough Oct 24, 2020, 5:06 PM

#

so each row vector is a single datapoint

shell berry Oct 24, 2020, 5:07 PM

#

What if feature 1 is a bag of words and feature 2 is a bag of POS tags? [[1,0,1,0,1],[0,0,0,0,0,1]] or something. Is this a good feature vector? It seems intuitively hard to "graph" these as one point

tidal bough Oct 24, 2020, 5:08 PM

#

why not? That's a lot of binary features.
More generally, a feature is anything your model accepts 😛

shell berry Oct 24, 2020, 5:10 PM

#

Cool thanks lol

#

What do you think would be better, a bag of bigrams of (word, pos_tag) or bag of words followed by a bag of POS tags?

#

Im sure it depends on the scenario but any thoughts?

tidal bough Oct 24, 2020, 5:13 PM

#

the former seems to make more sense, though depending on the model it might not matter

#

like, if the model knows there's a correspondence between the two bags, it's the same as if they were already in bigrams

shell berry Oct 24, 2020, 5:21 PM

#

Thanks @tidal bough, I'll try both just to experiment 🙂 I now have a list of lists of tuples, where each inner list is a sentence and each tuple is (word, tag). However, I can't use CountVectorizer or TF-IDF now. Is there another way to make it an input vector, or should I convert the tuples to strings?
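
One possible approach is to join each tuple into a single string token and hand the pre-tokenized lists to `CountVectorizer` through a callable `analyzer`. A minimal sketch with made-up tagged sentences; the `word_TAG` naming is just an assumption for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up tagged sentences: each inner list holds (word, tag) tuples.
tagged_sentences = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

# Join each tuple into a single token such as "dog_NN".
token_lists = [
    [f"{word}_{tag}" for word, tag in sentence] for sentence in tagged_sentences
]

# A callable analyzer makes the vectorizer use the tokens as given,
# skipping its own preprocessing and tokenization.
vectorizer = CountVectorizer(analyzer=lambda tokens: tokens)
X = vectorizer.fit_transform(token_lists)

print(sorted(vectorizer.vocabulary_))
print(X.toarray())
```

`TfidfVectorizer` accepts the same `analyzer` argument, so the same trick should carry over.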

bold olive Oct 24, 2020, 5:24 PM

#

Any reason why mean accuracy from cross_val_score and manually average will be different?

heady hatch Oct 24, 2020, 6:06 PM

#

What do you mean by manually average?

#

@hollow sentinel what do you mean by no accuracy?

bold olive Oct 24, 2020, 6:33 PM

#

@heady hatch outputting separate accuracies for each fold and then taking the mean of it.

hollow sentinel Oct 24, 2020, 7:03 PM

#

like accuracy is blank in the pic I sent

heady hatch Oct 24, 2020, 8:02 PM

#

@hollow sentinel if you take a look at accuracy in the third column, it shows 0.84.
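
In scikit-learn's text report, the accuracy row leaves the precision and recall columns blank and prints the single value under f1-score, which is why it can look empty. A minimal sketch with made-up labels:

```python
from sklearn.metrics import accuracy_score, classification_report

# Made-up labels and predictions.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# Text report: accuracy gets its own row, with the value under f1-score.
print(classification_report(y_true, y_pred))

# The same numbers as a dictionary, including an "accuracy" key.
report = classification_report(y_true, y_pred, output_dict=True)
print(report["accuracy"])
```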

#

@bold olive So from my understanding are you taking the accuracy of predicted train and predicted val and taking the mean?

bold olive Oct 24, 2020, 8:08 PM

#

Yes, exactly. Accuracy from each fold and then averaged in the end.

hollow sentinel Oct 24, 2020, 8:11 PM

#

can someone explain what StandardScaler is and why you need to do it on your dataset before you run k nearest neighbors

#

also how is k nearest neighbors regression different from linear regression

heady hatch Oct 24, 2020, 8:16 PM

#

@bold olive I believe cross_val_score uses k-fold cross-validation by default (stratified k-fold for classifiers).

#

Meaning that it trains it on training set and then score it on validation set.

#

@hollow sentinel You want to scale the features before running KNN because KNN takes distance into account. Features on different scales will throw these distance calculations off.

KNN vs linear regression, I recommend reading on the two algorithms.

In short, KNN uses k neighbors to calculate the score. While linear regression uses a linear model, ie y = mx + b.
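
A minimal sketch of scaling before KNN, on made-up data where one feature is on a much larger scale than the other:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data: the second feature is on a far larger scale than the first.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 1000, 200)])
y = (X[:, 0] > 0).astype(int)  # only the small-scale feature matters

# StandardScaler rescales each feature to mean 0 and standard deviation 1.
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))

# Putting the scaler in a pipeline applies it before the distance computation.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X, y)
```

Without scaling, the Euclidean distances here would be dominated by the second feature, even though it carries no information about `y`.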

hollow sentinel Oct 24, 2020, 8:19 PM

#

i think it's time to break out the Intro to Statistical Learning

heady hatch Oct 24, 2020, 8:23 PM

#

Good luck.

hollow sentinel Oct 24, 2020, 8:26 PM

#

It's so boring to read

heady hatch Oct 24, 2020, 8:27 PM

#

It might help to apply the concepts to real life.

bold olive Oct 24, 2020, 8:28 PM

#

@heady hatch , we can change the validation technique in the function.

heady hatch Oct 24, 2020, 8:28 PM

#

What do you mean by in the function?

bold olive Oct 24, 2020, 8:29 PM

#

cross_val_score(estimator, X, y, cv=cv)

#

cv can be anything we declare.

heady hatch Oct 24, 2020, 8:31 PM

#

Did you set anything there?

#

Because I think by default, it uses kfold.

bold olive Oct 24, 2020, 8:33 PM

#

Yes, I'm using stratified shuffle split and calling it.

#

Even if I use KFold the problem is that the accuracies are different!

#

cross_val_score is reliable right?

heady hatch Oct 24, 2020, 8:36 PM

#

it's just a function that does cross validation. hahaha

bold olive Oct 24, 2020, 8:36 PM

#

Yeah ik I meant the way it calculates the accuracy metric

heady hatch Oct 24, 2020, 8:36 PM

#

Yup.

bold olive Oct 24, 2020, 8:37 PM

#

Something probably wrong in my manual approach then!

heady hatch Oct 24, 2020, 8:37 PM

#

Hmm one question to ask is why are you looking at the accuracy of the training set?

bold olive Oct 24, 2020, 8:37 PM

#

Uh, not the training, test

#

Or does crossval compute training accuracies?

heady hatch Oct 24, 2020, 8:39 PM

#

It does not.

#

I was wondering because you said you took the average of training and validation.

bold olive Oct 24, 2020, 8:39 PM

#

No! The average over all folds

heady hatch Oct 24, 2020, 8:40 PM

#

So you trained the model on the training set, scored it on the validation set and then took the average of all the validation score?

bold olive Oct 24, 2020, 8:40 PM

#

Yes

heady hatch Oct 24, 2020, 8:40 PM

#

Ahh.

#

Hmm the other factor I would probably consider is maybe the splitting through each fold.

#

Might not be splitting the same way.

#

I think something you can try is write your own splitting function

seed it
do cross_val_score using your own splitting function
seed it again
split it the same way each fold.
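
A minimal sketch of that check, using a seeded scikit-learn splitter instead of a hand-written one. The data is synthetic, and `StratifiedKFold` is an assumption here; any splitter with a fixed integer `random_state` should behave the same way.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000)

# A seeded splitter yields the same folds every time split() is called.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

auto_scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

# Manual loop: fit a fresh copy on each training fold, score on the held-out fold.
manual_scores = []
for train_idx, val_idx in cv.split(X, y):
    fold_model = clone(model).fit(X[train_idx], y[train_idx])
    manual_scores.append(fold_model.score(X[val_idx], y[val_idx]))

print(auto_scores.mean(), np.mean(manual_scores))
```

If the two means still differ with identical folds, the difference is more likely in the manual loop (for example reusing a fitted model, or scoring on the wrong indices) than in `cross_val_score`.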

shell berry Oct 25, 2020, 12:12 AM

#

Is anyone here who is well versed in scikit-learn offering paid tutoring services?

hollow sentinel Oct 25, 2020, 12:50 AM

#

@shell berry you can try finding a udemy course that does that. Are you a beginner to machine learning? If you are I would use Python for Data Science and Machine Learning Bootcamp by Jose Portilla

oblique socket Oct 25, 2020, 1:25 AM

#

Is there a better way to implement cross_validation_split using pandas and numpy?

```python
def cross_validation_split(df, num_folds):
    folds = []
    fold_length = df.index.size // num_folds
    shuffled = df.sample(frac=1)
    for i in range(num_folds):
        folds.append(shuffled.iloc[i*fold_length:(i+1)*fold_length])
    return folds
```

cerulean spindle Oct 25, 2020, 1:34 AM

#

Have you tried the KFold class in sklearn? I'm pretty sure you can pass it as the cv parameter of cross_validate and get the splitting done automatically, without writing your own function.

oblique socket Oct 25, 2020, 1:34 AM

#

I saw that, I wanted to try it without sklearn first

cerulean spindle Oct 25, 2020, 1:34 AM

#

Oh ok.

oblique socket Oct 25, 2020, 1:35 AM

#

I'll try that

velvet thorn Oct 25, 2020, 1:35 AM

#

you can just use len(df)

#

also, if you do it that way, your folds will all be the same size

oblique socket Oct 25, 2020, 1:35 AM

#

oh yeah, I wasn't sure if I could do that

#

or if it made a difference

velvet thorn Oct 25, 2020, 1:35 AM

#

which could omit rows if the number of rows you have is not perfectly divisible by the number of folds

#

other than that it looks more or less okay

#

space out your operators
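
A minimal sketch of a variant that keeps every row by letting the folds differ in size by at most one. It reuses the function name from the question; the `seed` parameter is a hypothetical addition for reproducibility.

```python
import numpy as np
import pandas as pd

def cross_validation_split(df, num_folds, seed=None):
    """Shuffle the rows and split them into num_folds nearly equal folds."""
    shuffled = df.sample(frac=1, random_state=seed)
    # array_split allows uneven pieces, so no rows are left out.
    index_chunks = np.array_split(np.arange(len(shuffled)), num_folds)
    return [shuffled.iloc[chunk] for chunk in index_chunks]

# Made-up frame with 10 rows, which does not divide evenly into 3 folds.
df = pd.DataFrame({"x": range(10)})
folds = cross_validation_split(df, num_folds=3, seed=0)
print([len(fold) for fold in folds])  # [4, 3, 3]
```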

oblique socket Oct 25, 2020, 1:37 AM

#

oh yeah

#

thanks

velvet thorn Oct 25, 2020, 1:38 AM

#

yw

oblique socket Oct 25, 2020, 1:45 AM

#

also, if you do it that way, your folds will all be the same size
@velvet thorn What do mean? Are they not the same size?

velvet thorn Oct 25, 2020, 1:45 AM

#

which could omit rows if the number of rows you have is not perfectly divisible by the number of folds
@velvet thorn see this

oblique socket Oct 25, 2020, 1:45 AM

#

yeah

agile wing Oct 25, 2020, 3:02 AM

#

just realized logistic regression default solver in scikit-learn uses l-bfgs solver instead of gradient descent

shell berry Oct 25, 2020, 3:57 AM

#

@hollow sentinel Thanks for the advice. I'm actually looking for some guidance on a particular project and I have specific questions

lapis sequoia Oct 25, 2020, 5:37 AM

#

Hey guys, I'm not seeing the branches, do you guys know the issue?

```python
dtree = dtree.fit(X_train, y_train)
```

```python
plot_tree(dtree,
          filled=True,
          rounded=True,
          class_names=['released', 'deceased'],
          feature_names=X.columns)
```

📎 Screenshot_2020-10-25_at_06.36.13.png

glacial rune Oct 25, 2020, 10:19 AM

#

Not sure which topical chat / help channel is the best place for this, sorry, but I'm making an HTTP request to https://www.asos.com/api/product/catalogue/v3/stockprice?productIds=20510882&store=COM, and in Python the response I get is different from the one I get in my browser.

#

On my browser:

📎 unknown.png

#

On Python with requests library:

📎 unknown.png

#

any ideas what's causing this? This code ran fine a few weeks ago, but I tried it again today and it didn't work. Not sure if asos changed their backend

molten hamlet Oct 25, 2020, 12:58 PM

#

why it does not work?

#

your code is 200, so its fine

#

@glacial rune

glacial rune Oct 25, 2020, 1:02 PM

#

the response is different in python

molten hamlet Oct 25, 2020, 1:07 PM

#

what do you mean?

glacial rune Oct 25, 2020, 1:20 PM

#

the response body from Python is:

{"id":14014948,"name":"Nike Air Jordan 1 Mid trainers in colourblock","description":"<a href="/women/shoes/trainers/cat/?cid=6456"><strong>Trainers</strong></a> by   <a href="women/a-to-z-of-brands/jordan/cat/?cid=29517"><strong>Jordan</strong></a><ul>    <li><span style="background-color: initial;">Unboxing potential: considerable</span></li><li>Mid rise</li><li>Padded cuff for a supportive fit</li><li>Lace-up fastening&nbsp;</li><li>Nike Swoosh logo</li><li>Perforated toe cap for breathability</li><li>Helps keep them fresher for longer</li><li>Nike Air sole with Air units</li><li>Units contain pressurised air that compress on impact</li><li>For lightweight, durable cushioning</li><li>Rubber outsole</li></ul>","alternateNames":[{"locale":"en-GB","title":"Nike Air Jordan 1 Mid trainers in colourblock"},{"locale":"ru-RU","title":"Кроссовки средней высоты в стиле колор блок Nike Air Jordan 1"},{"locale":"sv-SE","title":"Nike – Air Jordan 1 – Blockfärgade träningsskor med halvhögt skaft"}],"localisedData":null,"gender":"Women","productCode":"1611119","pdpLayout":"Footwear",

#

I'm expecting it to look more like my first screenshot, with the productID and prices

lapis sequoia Oct 25, 2020, 1:48 PM

#

data

summer holly Oct 25, 2020, 2:18 PM

#

Hi everyone, I'm trying to build flask backend that takes tweets as input, preprocesses and makes predictions. I want to use a keras model saved as h5 format. Can anyone direct me to any helpful resources on this? Thank you

hollow sentinel Oct 25, 2020, 4:08 PM

#

can someone explain why sklearn needs dummy columns

📎 unknown.png

lapis sequoia Oct 25, 2020, 4:13 PM

#

📎 unknown.png

#

Somebody help me with this... why does the normal DQN perform so much better than a lot of the other ones

sleek rampart Oct 25, 2020, 4:13 PM

#

I need help with Deep Neural Network, like OG pro

lapis sequoia Oct 25, 2020, 4:13 PM

#

Is it because DQN was faster to train in a simple environment?

sleek rampart Oct 25, 2020, 4:14 PM

#

what does DQN mean?@lapis sequoia does it mean double Q Network?

#

Deep Q Network I see, cool

hollow sentinel Oct 25, 2020, 4:19 PM

#

idek what that is

#

machine learning noob here

sleek rampart Oct 25, 2020, 4:23 PM

#

School Homework?^

hollow sentinel Oct 25, 2020, 4:38 PM

#

@sleek rampart are you talking about what I posted

heady hatch Oct 25, 2020, 4:43 PM

#

Hey @hollow sentinel , the reason for dummy variables is for dealing with categorical variables.

#

You could also use ordinal encoder but sometimes order doesn’t make sense, so you need dummies instead.

#

Dummy features also allow you to access multi class.

quick epoch Oct 25, 2020, 4:44 PM

#

Yo guys. Can you help me with the problem?

#

For some reason I am not getting multicoloured line

heady hatch Oct 25, 2020, 4:45 PM

#

I’m not familiar with visualization, but have you checked the documentations?

quick epoch Oct 25, 2020, 4:45 PM

#

📎 image0.jpg

hollow sentinel Oct 25, 2020, 4:45 PM

#

you need dummy columns to run a random forest/decision trees?

heady hatch Oct 25, 2020, 4:45 PM

#

You need dummy columns to deal with non-numerical data.
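
A minimal sketch of creating dummy columns with `pandas.get_dummies`, on a made-up frame with one string column:

```python
import pandas as pd

# Made-up frame with one categorical (string) column.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue"],
    "price": [10, 12, 9, 11],
})

# Each category becomes its own indicator column; numeric columns pass through.
encoded = pd.get_dummies(df, columns=["color"])
print(list(encoded.columns))
```

For linear models it is common to also pass `drop_first=True`, since the full set of indicator columns is perfectly collinear with an intercept.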

quick epoch Oct 25, 2020, 4:45 PM

#

📎 image0.jpg

#

I am just getting a green curve

hollow sentinel Oct 25, 2020, 4:46 PM

#

@quick epoch not using jupyter notebook is a dangerous game haha

quick epoch Oct 25, 2020, 4:46 PM

#

I know. But I got all the necessary libraries and etc xd

#

So I am not worried about that XD

hollow sentinel Oct 25, 2020, 4:46 PM

#

@quick epoch i couldn't get pycharm to run on my machine properly lmao

dawn vault Oct 25, 2020, 4:47 PM

#

hey anyone familiar with dash library ?

quick epoch Oct 25, 2020, 4:47 PM

#

It’s easy actually. There is a way to install all necessary packages automatically using conda

#

But I am not good at data visualisation but I need this for data science and ml

#

So if you could help me I would appreciate it

hollow sentinel Oct 25, 2020, 4:48 PM

#

https://stackoverflow.com/questions/62020588/seaborn-multi-line-plot-with-only-one-line-colored

#

maybe that's it idk

#

I've been using seaborn the most bc that's what I've been doing with Udemy

quick epoch Oct 25, 2020, 4:49 PM

#

There are multiple plots and I need one

#

That changes

hollow sentinel Oct 25, 2020, 4:50 PM

#

you need one but you want certain parts of the line to be different colors

#

what graphing library are you using

quick epoch Oct 25, 2020, 4:50 PM

#

Matplotlib

#

What I mean is

#

📎 image0.jpg

#

Do you see these if statements?

hollow sentinel Oct 25, 2020, 4:50 PM

#

https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/multicolored_line.html

quick epoch Oct 25, 2020, 4:51 PM

#

I want the plot to change whenever it sees those values in the csv file

hollow sentinel Oct 25, 2020, 4:51 PM

#

i get it the doc for multi colored line might help

#

https://scipy-cookbook.readthedocs.io/items/Matplotlib_MulticoloredLine.html
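
#

For reference, a minimal sketch of the approach those pages describe, using matplotlib's `LineCollection` so a single line changes color with its values. The sine data and the thresholds are placeholders I made up; with a csv you would swap in your own columns and cut-off values.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection
from matplotlib.colors import BoundaryNorm, ListedColormap

# Placeholder data; replace with the x/y columns read from the csv file.
x = np.linspace(0, 10, 200)
y = np.sin(x)

# Split the curve into short segments, each joining two consecutive points.
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Assumed thresholds: red below -0.5, green in between, blue above 0.5.
cmap = ListedColormap(["red", "green", "blue"])
norm = BoundaryNorm([-1.0, -0.5, 0.5, 1.0], cmap.N)

# Color every segment by the y value at its starting point.
lc = LineCollection(segments, cmap=cmap, norm=norm)
lc.set_array(y[:-1])
lc.set_linewidth(2)

fig, ax = plt.subplots()
ax.add_collection(lc)
# add_collection does not rescale the axes, so set the limits by hand.
ax.set_xlim(x.min(), x.max())
ax.set_ylim(-1.1, 1.1)
fig.savefig("multicolored_line.png")
```

A plain `plt.plot(x, y, color=...)` call draws the whole line in one color, which may be why only a green curve shows up; coloring per segment avoids that.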

dawn vault Oct 25, 2020, 4:53 PM

#

@hollow sentinel are you familiar with dash? and datatables and callbacks? if so you might be able to help me out? thanks

hollow sentinel Oct 25, 2020, 4:53 PM

#

@dawn vault hahhahaha I've only been doing machine learning for 2 or 3 weeks

#

is dash a graphing library like plotly?

dawn vault Oct 25, 2020, 4:55 PM

#

sort of.. one can build dashboards quite easily.. and yeah dash and plotly are more or less the same thing..

#

with dash one can build interactive dashboards using plotly and stuff

hollow sentinel Oct 25, 2020, 4:56 PM

#

yeah so far I only know pandas, matplotlib, seaborn, and plotly rn

dawn vault Oct 25, 2020, 4:56 PM

#

kk

#

I am working with pandas and dash/plotly to get my project going, but running into issues..

hollow sentinel Oct 25, 2020, 4:57 PM

#

F

#

don't worry you'll get help here

dawn vault Oct 25, 2020, 4:59 PM

#

so what are you working on right now ?

hollow sentinel Oct 25, 2020, 4:59 PM

#

support vector machines

dawn vault Oct 25, 2020, 4:59 PM

#

omg .. dont know any of those words... lol

hollow sentinel Oct 25, 2020, 4:59 PM

#

hahahhahahah me neither

#

this udemy course is killing me with all this new info

dawn vault Oct 25, 2020, 5:00 PM

#

which one?

hollow sentinel Oct 25, 2020, 5:00 PM

#

Python for Data Science and Machine Learning Bootcamp by Jose Portilla

#

I'd recommend it for people who want to start learning machine learning it's only 20 bucks

dawn vault Oct 25, 2020, 5:01 PM

#

jose portilla rings a bell.. i might have taken a course.. from him.. w

hollow sentinel Oct 25, 2020, 5:01 PM

#

yeah he's great

#

probably one of my favorite people to learn from

dawn vault Oct 25, 2020, 5:03 PM

#

took python for financial analysis and algo trading...