#data-science-and-ml | Python | Page 414

serene scaffold Jun 26, 2022, 1:40 AM

#

polars is written in rust BingShrug

#

so much for three hours

glad mulch Jun 26, 2022, 1:41 AM

#

for some reason, my where function is bugging out. Example code:

#

def generate_pnl(df: pd.DataFrame, gain, gain_std, loss, loss_std):
    for col in df.columns.values:
        start_time = time.time()
        print("Generating PNL values for portfolio {}".format(col))
        print(df[col].head())
        df[col] = df[col].where(
            (df[col] == 1),
            truncnorm.rvs(
                1, gain+gain_std, loc=gain, scale=gain_std, size=len(df)))
        print(df[col].head())
        df[col] = df[col].where(
            (df[col] == 0), -truncnorm.rvs(
                1, loss+loss_std, loc=loss, scale=loss_std, size=len(df)))
        print(df[col].head())
        end_time = time.time()
        duration = end_time - start_time
        print(
            f"Completed generating PNL values for portfolio {col} in {duration} seconds")
    return df

#

i did a quick print function at each point of the statement

#

# First print Statement
1    0
2    1
3    0
4    0
5    0
# Second print statement
1    243.949907
2      1.000000
3    208.573045
4    279.292684
5    202.035304
# Third Print Statement 
1   -241.167932
2   -251.109101
3   -265.073210
4   -202.495864
5   -282.503205
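
A likely cause, sketched under the assumption that a 1 should become a sampled gain and a 0 a negated sampled loss (the thread never confirms the intent): `Series.where(cond, other)` keeps values where `cond` is True and replaces the rest. After the first call no zeros are left in the column, so in the second call `df[col] == 0` is False everywhere and every row is overwritten, which matches the third print. Building the mask once from the original 0/1 values avoids this. `fill_pnl` and the toy arrays below are hypothetical; in the posted function the arrays would come from `truncnorm.rvs`.

```python
import numpy as np
import pandas as pd


def fill_pnl(signal: pd.Series, gains: np.ndarray, losses: np.ndarray) -> pd.Series:
    """Map 1 -> gain and 0 -> -loss (assumed intent), row by row."""
    # Compute the mask from the ORIGINAL values, before anything is overwritten.
    is_win = signal.to_numpy() == 1
    values = np.where(is_win, gains, -losses)
    return pd.Series(values, index=signal.index, name=signal.name)


# Toy stand-ins for one column and for the truncnorm samples.
signal = pd.Series([0, 1, 0, 0, 0], index=[1, 2, 3, 4, 5])
gains = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
losses = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

pnl = fill_pnl(signal, gains, losses)
print(pnl)
```

Only row 2 (the original 1) takes its value from `gains`; the other rows take the negated `losses`.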

serene scaffold Jun 26, 2022, 1:42 AM

#

it looks like each of these might be a Series rather than a df

glad mulch Jun 26, 2022, 1:43 AM

#

so i have a dataframe and iterate through each column

burnt citrus Jun 26, 2022, 2:09 AM

#

serene scaffold so much for three hours

no im done just got to pkg it

#

clone the repo and run the main.py

#

didn't need to c it

#

i did it how r does it

#

by done i mean the logic is all there

#

so it works as

#

the header of the cols will be the name of the array

#

and then you can mutate the array

#

like any other array

#

then write to file like that

#

have to finish that part

burnt citrus Jun 26, 2022, 3:30 AM

#

I am semi finished now

#

act looked at the pandas functions

#

its a little fucked

#

but i have just lost focus after my mom called so i will pick this up later

#

so much for 3 hrs

viral swan Jun 26, 2022, 4:44 AM

#

anyone can help how can I extract the data from this table like regex pattern?

hoary wigeon Jun 26, 2022, 6:32 AM

#

I need help with replacing null values

Im trying to fill industry values with mode of industry and bucket of titles,

lead_df['industry'] = lead_df.groupby('title')['industry'].transform(lambda x: x.fillna(x.value_counts().mode()[0]))

There are missing values for those combinations too.. and im getting `ERROR`

Is there any alternate way to do that?

agile cobalt Jun 26, 2022, 6:33 AM

#

which error are you getting?

hoary wigeon Jun 26, 2022, 6:55 AM

#

agile cobalt which error are you getting?

no values present in series KeyError: 0

tacit basin Jun 26, 2022, 7:15 AM

#

viral swan anyone can help how can I extract the data from this table like regex pattern?

splitting on \ would be a good start 🙂

viral swan Jun 26, 2022, 7:16 AM

#

yeah, im doing step by step

#

could not figure out how to apply a general rule

#

agile cobalt Jun 26, 2022, 7:23 AM

#

hoary wigeon no values present in series `KeyError: 0`

maybe try .iloc[0] instead of [0] then

untold bloom Jun 26, 2022, 7:26 AM

#

hoary wigeon I need help with replacing null values Im trying to fill industry values with m...

you're trying to fill the NaNs with the mode of counts rather than the mode of values themselves; is that intended? i.e., perhaps you should be doing x.mode()[0] instead.

#

but this isn't the source of error: you probably have a "title" for which all the "industry" values are NaN, hence there's no value to take the mode of because mode (or value_counts) won't consider NaN by default

#

so perhaps try x.mode(dropna=False)[0] to see if the error goes away. If so, you need to do something for those groups :p

#

also .iat[0] is slightly more clear than [0] here although they achieve the same in this specific case because what mode returns has a RangeIndex.
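
The suggestions above can be sketched as follows. The column names `title` and `industry` come from the question; the helper `fill_with_group_mode`, the toy data, and the choice to leave all-NaN groups untouched are assumptions for illustration only.

```python
import pandas as pd


def fill_with_group_mode(s: pd.Series) -> pd.Series:
    """Fill NaNs with the group's most frequent value, if it has one."""
    modes = s.mode()  # NaN is dropped by default
    if modes.empty:
        # Every value in this group is NaN: nothing to fill with, so leave it.
        return s
    return s.fillna(modes.iat[0])


lead_df = pd.DataFrame({
    "title": ["ceo", "ceo", "ceo", "cto", "cto"],
    "industry": ["tech", None, "tech", None, None],
})

lead_df["industry"] = (
    lead_df.groupby("title")["industry"].transform(fill_with_group_mode)
)
print(lead_df)
```

Groups with no known industry stay NaN here; filling them afterwards with the overall mode would be one possible fallback.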

steady basalt Jun 26, 2022, 10:00 AM

#

Another day another QUIZ! Who wants to do some Einstein notation! Multiplication…

#

Aka index notation

#

Need some rotations and reflections

wooden sail Jun 26, 2022, 10:03 AM

#

einstein notation does the soul good

steady basalt Jun 26, 2022, 10:05 AM

#

Didn’t it come from just

#

Laziness

#

My lord they want me to do it in numpy again. CBA

wooden sail Jun 26, 2022, 10:06 AM

#

it's not from laziness, it's just an alternative notation

#

a very powerful one at that

#

and yeah, numpy has an einsum function

#

that makes it so that your math involving multilinear transformations looks exactly the same on paper as in your code (which otherwise isn't the case, since you have to unfold tensors into matrices more or less arbitrarily)
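
As a small illustration of that point, here is a sketch of the matrix product C_ik = A_ij B_jk written once with `np.einsum` and once as explicit loops (the form needed when library functions are not allowed). The arrays are arbitrary toy values.

```python
import numpy as np

A = np.arange(6.0).reshape(2, 3)   # indices i, j
B = np.arange(12.0).reshape(3, 4)  # indices j, k

# Index notation: the repeated index j is summed over.
C_einsum = np.einsum("ij,jk->ik", A, B)

# The same sum written out line by line.
C_loops = np.zeros((2, 4))
for i in range(2):
    for k in range(4):
        for j in range(3):
            C_loops[i, k] += A[i, j] * B[j, k]

print(np.allclose(C_einsum, C_loops))
```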

steady basalt Jun 26, 2022, 10:09 AM

#

changing basis oh no

steady basalt Jun 26, 2022, 10:09 AM

#

wooden sail and yeah, numpy has an einsum function

not allowed functions

#

has to be literally line by line

wooden sail Jun 26, 2022, 10:10 AM

#

well, in that case, for you, it makes no difference which notation you use

steady basalt Jun 26, 2022, 10:10 AM

#

now its getting confusing af

#

objects basis vector + objects vector

#

translating to 3d

#

that makes 0 sense

wooden sail Jun 26, 2022, 10:11 AM

#

idk what you mean by objects vector

steady basalt Jun 26, 2022, 10:11 AM

#

yeah i have no idea

#

whats going on anymore

wooden sail Jun 26, 2022, 10:11 AM

#

what language are you learning the math in?

steady basalt Jun 26, 2022, 10:12 AM

#

ENglish

wooden sail Jun 26, 2022, 10:12 AM

#

but since last time you're not using any standard math terms

#

where are you getting these terms from

steady basalt Jun 26, 2022, 10:12 AM

#

the guy said it

#

the teacher

wooden sail Jun 26, 2022, 10:12 AM

#

oof

steady basalt Jun 26, 2022, 10:12 AM

#

he said we have a object

#

and hes said it has a vector

#

wooden sail Jun 26, 2022, 10:13 AM

#

is that exactly what they said? are we talking object in python or object in the real world?

steady basalt Jun 26, 2022, 10:13 AM

#

wooden sail Jun 26, 2022, 10:13 AM

#

is this from school or are you watching videos from random people on youtube?

steady basalt Jun 26, 2022, 10:13 AM

#

its coursera

wooden sail Jun 26, 2022, 10:14 AM

#

man, that's terrible

steady basalt Jun 26, 2022, 10:14 AM

#

its 40 euros a month

wooden sail Jun 26, 2022, 10:14 AM

#

at any rate, what they mean is that the object has a "position vector"

steady basalt Jun 26, 2022, 10:14 AM

#

im fu cking lost bro

#

ive had to cheat using a matrix calculator in the last quiz

wooden sail Jun 26, 2022, 10:15 AM

#

that course looks really bad from what i can see

steady basalt Jun 26, 2022, 10:15 AM

#

it was projecting shadows off of 3d objects and then doing some weird transformations

#

cant wait to just get thru the course put it on the cv and then go learn from somewhere more explainable

#

i can DM u the video lmao

wooden sail Jun 26, 2022, 10:16 AM

#

i would discourage you from rushing through it just to put it in your cv because this is all super elementary. you'll have to basically start again from 0 anyway

steady basalt Jun 26, 2022, 10:22 AM

#

my brain turns off at the 3rd minute

#

ngl

wooden sail Jun 26, 2022, 10:24 AM

#

ok, so

#

i personally don't like 3b1b, and especially not his linear algebra series

#

but many people seem to like it

steady basalt Jun 26, 2022, 10:24 AM

#

ive watched a few of his videos

wooden sail Jun 26, 2022, 10:24 AM

#

so one place to look at is here https://www.youtube.com/watch?v=P2LTAUO1TdA

YouTube

3Blue1Brown

Change of basis | Chapter 13, Essence of linear algebra


steady basalt Jun 26, 2022, 10:25 AM

#

what do you not like about his videos

#

he explains better than my course imo

wooden sail Jun 26, 2022, 10:25 AM

#

even though he WANTS them to be, his videos are NOT for people who're freshly learning a topic. his explanations only work if you've already learned the topic before and want a different perspective

#

in my opinion, at any rate

primal shuttle Jun 26, 2022, 10:27 AM

#

Agreed - you can't learn from these

wooden sail Jun 26, 2022, 10:27 AM

#

i'd also just recommend to look at gilbert strang's linear algebra book

steady basalt Jun 26, 2022, 10:27 AM

#

wassp, u shudda seen my 40 euro a month course

#

cant learn shit is wear

primal shuttle Jun 26, 2022, 10:27 AM

#

That's why content validation is crucial when picking study materials 🙂

#

I'll be making a video on this on my YT channel if you're interested

steady basalt Jun 26, 2022, 10:27 AM

#

what is ur channel

primal shuttle Jun 26, 2022, 10:28 AM

#

It'll start next week

wooden sail Jun 26, 2022, 10:28 AM

#

online courses don't work for everyone. some teachers are bad at making them, and some students are bad at learning from them. the same is true for all learning and teaching material too, so you should always keep an open mind, look for different kinds of material, etc

primal shuttle Jun 26, 2022, 10:28 AM

#

well after next week

steady basalt Jun 26, 2022, 10:28 AM

#

im a student whos slow to catch on

#

like... very slow

wooden sail Jun 26, 2022, 10:29 AM

#

i will also point out that even going to lectures at uni is not meant to teach you everything. lectures only work if you complement the material with your own studies

#

so extra material is ALWAYS needed

#

you need something to read

steady basalt Jun 26, 2022, 10:29 AM

#

is intro to linalg a good book

#

does it cover all of the stuff ive been doing

primal shuttle Jun 26, 2022, 10:30 AM

#

There is also this thing of going from theoretical to practical or the other way round, you need to work out what works best for you

wooden sail Jun 26, 2022, 10:30 AM

#

steady basalt is intro to linalg a good book

yep

primal shuttle Jun 26, 2022, 10:30 AM

#

@steady basalt find the book, see its contents, read an exemplary chapter on stuff you are currently learning

wooden sail Jun 26, 2022, 10:30 AM

#

the MIT one by gil strang?

primal shuttle Jun 26, 2022, 10:30 AM

#

See if it speaks to you

steady basalt Jun 26, 2022, 10:31 AM

#

strang yes

primal shuttle Jun 26, 2022, 10:31 AM

#

If it doesn't, find another source, rinse and repeat

steady basalt Jun 26, 2022, 10:31 AM

#

123 pounds are u serious

wooden sail Jun 26, 2022, 10:31 AM

#

look around here first https://math.mit.edu/~gs/linearalgebra/

#

to see if you find the style pleasant

barren wedge Jun 26, 2022, 10:33 AM

#

Any idea on how to generate infographic using AI?

steady basalt Jun 26, 2022, 10:34 AM

#

Not paying so much

#

Will use library

wooden sail Jun 26, 2022, 10:39 AM

#

the short explanation i can give you is based on the interpretation of the multiplication of matrices and vectors

#

it is usually helpful to think of a matrix as a collection of vectors. let's say you have a 3x3 matrix M. then you can think of M as [m_1 m_2 m_3], where each of the m_i are a vector in R^3

#

now, if you consider a vector v = [x,y,z]

#

the product M v is equal to x m_1 + y m_2 + z m_3

#

we observe that this is the very definition of a linear combination

#

if we write w = Mv, what this says is "the vector w is written as a linear combination of vectors. the vectors are the columns of M, and the coefficients are the entries of v"
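
A quick numerical check of this reading of w = Mv, with an arbitrary 3x3 matrix and vector chosen only for illustration:

```python
import numpy as np

M = np.array([
    [1.0, 2.0, 0.0],
    [0.0, 1.0, 3.0],
    [4.0, 0.0, 1.0],
])
v = np.array([2.0, -1.0, 0.5])  # [x, y, z]

# Ordinary matrix-vector product.
w_product = M @ v

# Same vector as a linear combination of the columns m_1, m_2, m_3.
w_combo = v[0] * M[:, 0] + v[1] * M[:, 1] + v[2] * M[:, 2]

print(w_product, w_combo)
```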

steady basalt Jun 26, 2022, 10:42 AM

#

I can do AijBjn or something

wooden sail Jun 26, 2022, 10:43 AM

#

now we just shift our viewpoint a little, and refer to v as a "coordinate vector". the entries of v are coordinates in some basis. that basis is formed by the columns of M.

#

and so w = Mv can be interpreted as "the vector w has coordinates v in the basis formed by the columns of M"

#

and now, to get the final bit, maybe we are not given v. maybe we are given w instead, and we want to FIND v. then w = Mv -> M^-1 w = v

#

so M^-1 is a change of basis transformation

steady basalt Jun 26, 2022, 10:45 AM

#

This change in basis can’t it just be explained as basic product of matrix and vector

#

Oh

wooden sail Jun 26, 2022, 10:45 AM

#

that's basically the idea i tried to convey just now, yes

steady basalt Jun 26, 2022, 10:45 AM

#

The basis is the matrix that multiples a vector

#

I thought the basis was the axis

wooden sail Jun 26, 2022, 10:45 AM

#

whenever you have a matrix vector product, you can choose to interpret it as expressing a vector in a special basis

steady basalt Jun 26, 2022, 10:45 AM

#

Same thing? Axis stretched?

wooden sail Jun 26, 2022, 10:46 AM

#

basis is not "the" axis because an n-dimensional space has n axes

#

the basis is all of the axes

steady basalt Jun 26, 2022, 10:46 AM

#

The basis of the vector

wooden sail Jun 26, 2022, 10:46 AM

#

the basis has the same number of elements in it as the dimensionality of the subspace containing the vector

#

in the panda example, the dude is working on a 2D plane

#

that means the basis has 2 vectors

steady basalt Jun 26, 2022, 10:47 AM

#

Ok

wooden sail Jun 26, 2022, 10:47 AM

#

and you want to use these 2 vectors to express any point in 2D space

steady basalt Jun 26, 2022, 10:47 AM

#

But a panda has many points it isn’t a square

wooden sail Jun 26, 2022, 10:47 AM

#

yeah well, the guy explained stuff really poorly

#

after watching the vid, what he means is

#

"consider a random point in 2D space"

steady basalt Jun 26, 2022, 10:48 AM

#

It’s just the uhh dimension direction?

wooden sail Jun 26, 2022, 10:48 AM

#

"to the panda, it seems the point has these coordinates. to me, though, it looks like the SAME point has DIFFERENT coordinates, because i'm looking at it from a different point of view"

steady basalt Jun 26, 2022, 10:48 AM

#

2d vs 3d u mean?

wooden sail Jun 26, 2022, 10:48 AM

#

no

#

both in 2D

#

that's the whole point

steady basalt Jun 26, 2022, 10:49 AM

#

But panda is in 2d

#

And you are 3d

#

Both looking at 2d

wooden sail Jun 26, 2022, 10:49 AM

#

the panda is an observer looking at the 2D plane

#

you are an observer looking at the same 2D plane

#

you're both looking at the same point

steady basalt Jun 26, 2022, 10:49 AM

#

So the panda is the same as me from a different angle?

wooden sail Jun 26, 2022, 10:49 AM

#

sure

steady basalt Jun 26, 2022, 10:49 AM

#

It’s not inside the computer

#

In the graph space

wooden sail Jun 26, 2022, 10:50 AM

#

no

steady basalt Jun 26, 2022, 10:50 AM

#

Oh

wooden sail Jun 26, 2022, 10:50 AM

#

mind you, it could be, but then the transformation does not involve square matrices with inverses, but rather rectangular matrices that are either left or right-invertible

#

and i'm pretty sure you haven't gotten there yet in your content 😛

steady basalt Jun 26, 2022, 10:50 AM

#

Wait a minute

wooden sail Jun 26, 2022, 10:50 AM

#

what you're saying CAN be done, you'll learn it later

#

but for now assume just that you're both the same and looking at a 2D plane

steady basalt Jun 26, 2022, 10:51 AM

#

Me and panda both see the point in space at the same place but we have different axis values because we both start at zero?

wooden sail Jun 26, 2022, 10:51 AM

#

no

steady basalt Jun 26, 2022, 10:51 AM

#

so it appears different

wooden sail Jun 26, 2022, 10:51 AM

#

rather, the point where you placed the 0 might be different. or maybe you're looking at it from a skewed angle

steady basalt Jun 26, 2022, 10:51 AM

#

thats what i mean

wooden sail Jun 26, 2022, 10:51 AM

#

or more generally, the only condition for a set of n vectors to form a basis of an n-dimensional space is for them to be linearly independent

steady basalt Jun 26, 2022, 10:52 AM

#

its physically in the same place but appears different so thats confusing me

#

this sort of makes 'physically' not a thing anymore

wooden sail Jun 26, 2022, 10:52 AM

#

but you have that experience every day

steady basalt Jun 26, 2022, 10:52 AM

#

because its literally not in the same place mathematically

wooden sail Jun 26, 2022, 10:52 AM

#

you point at a thing that's far away and tell your friend to look at it

#

and he struggles to find the thing

#

he sees you pointing at it, but his eyes are not located where yours are

steady basalt Jun 26, 2022, 10:52 AM

#

isnt the location of a point in space defined by your axis, so the same point is different from another angle

wooden sail Jun 26, 2022, 10:53 AM

#

so he can't see where exactly you're pointing

steady basalt Jun 26, 2022, 10:53 AM

#

that seems to break physical laws its so out there

wooden sail Jun 26, 2022, 10:53 AM

#

idk what you're even trying to say

#

absolute space location is not a thing

steady basalt Jun 26, 2022, 10:53 AM

#

cuz physically its in the same place

wooden sail Jun 26, 2022, 10:53 AM

#

sure

#

but how do you describe where "the same place" is?

steady basalt Jun 26, 2022, 10:54 AM

#

its easier to imagine in 3d

#

with like a object sitting stationary in space

wooden sail Jun 26, 2022, 10:54 AM

#

there is nothing that makes one coordinate system more valid than another

steady basalt Jun 26, 2022, 10:54 AM

#

two people observe that object

#

its in the same literal space

wooden sail Jun 26, 2022, 10:54 AM

#

this is exactly the example i just gave you

steady basalt Jun 26, 2022, 10:54 AM

#

but they both have their co-ord system catered towards their pov

wooden sail Jun 26, 2022, 10:54 AM

#

idk if you're just ignoring what i'm writing

steady basalt Jun 26, 2022, 10:55 AM

#

so mathematically they have a different point infront of them but its the same point

#

is that waht u mean

wooden sail Jun 26, 2022, 10:55 AM

#

idk what you mean by "mathematically" there

steady basalt Jun 26, 2022, 10:55 AM

#

in terms of their own description of its location

wooden sail Jun 26, 2022, 10:55 AM

#

the point is the same, you can choose whatever coordinate system you like to describe it

steady basalt Jun 26, 2022, 10:55 AM

#

one guy can say its at x co-ords and the other says y

#

is that what the panda is about

wooden sail Jun 26, 2022, 10:55 AM

#

yes

steady basalt Jun 26, 2022, 10:56 AM

#

okay

wooden sail Jun 26, 2022, 10:56 AM

#

but you're trying to give it some extra physical meaning that also doesn't exist lol

#

there's no such thing as absolute coordinates anyway

steady basalt Jun 26, 2022, 10:56 AM

#

its all perspective?

wooden sail Jun 26, 2022, 10:56 AM

#

sure

#

all the maps you use in real life follow a convention someone made up

#

there's no reason why they're "more correct"

steady basalt Jun 26, 2022, 10:57 AM

#

yeah so this problem is all about describing a point in another co-ord system?

wooden sail Jun 26, 2022, 10:57 AM

#

yeah

steady basalt Jun 26, 2022, 10:57 AM

#

but we need to know that systems axis

#

values

wooden sail Jun 26, 2022, 10:57 AM

#

right

steady basalt Jun 26, 2022, 10:57 AM

#

is that where the basis vectors come in

wooden sail Jun 26, 2022, 10:57 AM

#

yeah

steady basalt Jun 26, 2022, 10:58 AM

#

those are the vectors which point towards the object

#

?

wooden sail Jun 26, 2022, 10:58 AM

#

there IS one assumption made, and it's that the coordinate systems share the same origin

#

yeah

#

so you have a point p in 2D space

steady basalt Jun 26, 2022, 10:58 AM

#

so we have the basis vector not of the panda but of the object from the pandas pov? why did he say of the panda

wooden sail Jun 26, 2022, 10:58 AM

#

you could write p as ax + by, or you could write it as wu + vz

steady basalt Jun 26, 2022, 10:58 AM

#

was he talking about a random point

wooden sail Jun 26, 2022, 10:59 AM

#

yeah, he meant to say of the panda's POV

steady basalt Jun 26, 2022, 10:59 AM

#

same origin?

wooden sail Jun 26, 2022, 10:59 AM

#

so, vectors don't inherently have a location in space

#

so "canonically" (i.e. someone made up the convention and we follow it), vectors are assumed to have their tail at some origin, which we usually call (0,0) in the canonical basis formed by the nice and simple vectors [1,0] and [0,1]

steady basalt Jun 26, 2022, 11:00 AM

#

but i thought that the origin depended on the pov

#

the co-ord system

#

now im confused

#

were stil talking 2d

wooden sail Jun 26, 2022, 11:00 AM

#

it can, and these receive the name of affine transformations

#

you'll also learn that later

#

but for now assume they have the same origin, but are maybe slanted or stretched

steady basalt Jun 26, 2022, 11:01 AM

#

if both basis vectors come out of the same origin, how is it a different pov

wooden sail Jun 26, 2022, 11:01 AM

#

because of the slant or stretch

#

for example

#

consider the basis [1,0], [0,1]

steady basalt Jun 26, 2022, 11:01 AM

#

but its the same co-ord grid aka same pov

wooden sail Jun 26, 2022, 11:01 AM

#

now consider the new basis [3,0], [0,1]

#

if we have the point that, in the canonical basis, has coordinates [3,1]

#

in the new basis, this vector has coordinates [1,1]

steady basalt Jun 26, 2022, 11:02 AM

#

i need a minute to envision that

wooden sail Jun 26, 2022, 11:02 AM

#

because the new basis has a longer vector to explain the horizontal axis

#

this is the same as, for example, giving THE SAME LENGTH in km vs in miles

#

but more generally you can also have a slant, instead of just a stretch
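
The stretch example above, plus one slanted basis, can be checked numerically. This is a sketch: the slanted basis [1,0], [1,1] is an extra example not taken from the conversation. Each basis is stored as the columns of a matrix B, and the new coordinates c solve B c = p.

```python
import numpy as np

p = np.array([3.0, 1.0])  # the point, in canonical coordinates

# Stretched basis from the chat: [3,0] and [0,1] as columns.
B_stretch = np.array([
    [3.0, 0.0],
    [0.0, 1.0],
])
c_stretch = np.linalg.solve(B_stretch, p)  # same as inv(B) @ p

# Slanted basis (illustrative assumption): [1,0] and [1,1] as columns.
B_slant = np.array([
    [1.0, 1.0],
    [0.0, 1.0],
])
c_slant = np.linalg.solve(B_slant, p)

print(c_stretch, c_slant)
```

Multiplying back, `B_stretch @ c_stretch` and `B_slant @ c_slant` both return the original point `p`: same point, different coordinates.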

steady basalt Jun 26, 2022, 11:03 AM

#

ok so the co-ords are translated to whatever the basis units are

#

why is it even called a basis in the first place

#

cant values go less than a basis unit

wooden sail Jun 26, 2022, 11:04 AM

#

because you explain every point in space as being made up from the elements of the basis

#

the whole space is "based" on them

wooden sail Jun 26, 2022, 11:04 AM

#

steady basalt cant values go less than a basis unit

and yes, they can

steady basalt Jun 26, 2022, 11:04 AM

#

but you can still get 0.5 on a 1,1 basis

wooden sail Jun 26, 2022, 11:04 AM

#

sure

steady basalt Jun 26, 2022, 11:05 AM

#

so the grid squares in 1,0 0,1 basis are squares but in the 3,1 are rectangles?

wooden sail Jun 26, 2022, 11:05 AM

#

all of them are equivalent to each other in some sense. the whole idea is exactly that

steady basalt Jun 26, 2022, 11:05 AM

#

this is getting really hard to envision now

wooden sail Jun 26, 2022, 11:05 AM

#

that it doesn't matter what basis you use

#

they all do the same job

steady basalt Jun 26, 2022, 11:05 AM

#

how did u manage to get this to sink in in the first place

#

this is purely based on 3rd eye strength

primal shuttle Jun 26, 2022, 11:06 AM

#

Practice

steady basalt Jun 26, 2022, 11:06 AM

#

xd

wooden sail Jun 26, 2022, 11:06 AM

#

practice is one thing, since it helps develop intuition

#

but also, algebra is very powerful independently of visualization

#

the simpler idea is kinda like this

#

imagine i tell you "we have this number 5 here "

steady basalt Jun 26, 2022, 11:07 AM

#

3b1b has lost me

wooden sail Jun 26, 2022, 11:07 AM

#

"what 2 numbers did we add in order to get 5?"

#

and i tell you nothing else

#

you quickly realize this question has infinitely many answers

#

5 = 0 + 5, but also 1 + 4, and also -0.99999 + 5.99999

steady basalt Jun 26, 2022, 11:08 AM

#

i saw on some news show that a woman said 2+2 may not actually = 4

wooden sail Jun 26, 2022, 11:08 AM

#

...

steady basalt Jun 26, 2022, 11:09 AM

#

something about math being racist and the way we understand it is subjective

#

american*

primal shuttle Jun 26, 2022, 11:09 AM

#

……

steady basalt Jun 26, 2022, 11:09 AM

#

according to her, it cud be another system entirely

serene scaffold Jun 26, 2022, 11:09 AM

#

steady basalt i saw on some news show that a woman said 2+2 may not actually = 4

this was probably a metaphor for something unrelated to math that fell flat.

wooden sail Jun 26, 2022, 11:09 AM

#

that essentially made the remainder of my patience evaporate. best of luck with learning change of bases!

steady basalt Jun 26, 2022, 11:09 AM

#

serene scaffold this was probably a metaphor for something unrelated to math that fell flat.

it was a talk on maths

steady basalt Jun 26, 2022, 11:10 AM

#

wooden sail that essentially made the remainder of my patience evaporate. best of luck with ...

im starting to understand now

#

when you inverse matrix it takes the old basis back

wooden sail Jun 26, 2022, 11:13 AM

#

my final attempt will be algebraic

#

consider again the equation w = Mv

#

more explicitly, we can now write I w = M v, where I is an appropriately sized identity matrix

#

we say that v = M^-1 I w is a vector in the basis M, because we need to multiply it by M again to return to a vector that is a linear combination of the canonical basis vectors

#

we can see this by taking Mv = M M^-1 I w = I^2 w = I w = w, without any further dependence on M

steady basalt Jun 26, 2022, 11:16 AM

#

cause I doesnt do anything ?

wooden sail Jun 26, 2022, 11:16 AM

#

without any geometric interpretation, this holds in arbitrarily many dimensions

#

right

steady basalt Jun 26, 2022, 11:17 AM

#

what happens when you actually show this on a graph, so far ive only seen co ordinate vectors

wooden sail Jun 26, 2022, 11:17 AM

#

that's what the 3b1b video shows

steady basalt Jun 26, 2022, 11:17 AM

#

you cant show a 3x3 matrix can u

#

or can you

#

is it a cube?

wooden sail Jun 26, 2022, 11:18 AM

#

but anyway as soon as you move away from R^1, 2, and 3, there is no longer any good visualization

#

a 3x3 matrix is an object in 9 dimensional space

steady basalt Jun 26, 2022, 11:18 AM

#

111 111 111 is a cube?

#

oh

wooden sail Jun 26, 2022, 11:18 AM

#

the matrix is not in the space the vectors are in. it's a function that acts on those vectors

steady basalt Jun 26, 2022, 11:18 AM

#

so im never gona be able to see actual matrices

#

just vectors

wooden sail Jun 26, 2022, 11:19 AM

#

but you CAN look at the columns as vectors in that same space

steady basalt Jun 26, 2022, 11:19 AM

#

wooden sail a 3x3 matrix is an object in 9 dimensional space

9D object?

#

and 3x1 is 3d?

wooden sail Jun 26, 2022, 11:19 AM

#

yes

steady basalt Jun 26, 2022, 11:19 AM

#

1x3 also 3d

wooden sail Jun 26, 2022, 11:20 AM

#

yes

steady basalt Jun 26, 2022, 11:20 AM

#

2x2 is 4d?

wooden sail Jun 26, 2022, 11:20 AM

#

yes

steady basalt Jun 26, 2022, 11:20 AM

#

i wish i cud see it

wooden sail Jun 26, 2022, 11:21 AM

#

you can't, and the idea of linear algebra is precisely that

#

take a nice and easy behavior that is easy to visualize in low dimensions

#

and now generalize it to arbitrarily weird structures that satisfy the same conditions

steady basalt Jun 26, 2022, 11:21 AM

#

as I saw earlier you can translate from 3d to 2d, so cant u go from 4d to 3d to 2d

wooden sail Jun 26, 2022, 11:22 AM

#

it can be done, just not uniquely

#

3d to 2d already can't be done uniquely

steady basalt Jun 26, 2022, 11:22 AM

#

it was a shadow cast of an object in my quiz

wooden sail Jun 26, 2022, 11:23 AM

#

probably orthogonal projections

#

but anyway, yes, you can go from 4d to 2d

#

you just can't visualize it

steady basalt Jun 26, 2022, 11:24 AM

#

so you can not see the 2d version?

wooden sail Jun 26, 2022, 11:24 AM

#

you can see the 2d shadow, not the original 4d thing

#

and the shadow can be formed in infinitely many ways. you just saw one in your course

#

whenever you see something has the name "algebra" in it, you have to immediately be prepared to have no direct visualization. you can almost always construct illustrative examples that are simple, like working with 1, 2, and 3d space. but the point is to take that intuition, generalize it, and now be able to do similar things in more abstract scenarios

#

nothing stops you from projecting something in 1000d space down to 100d space, how do you visualize it? that's a different matter altogether
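
A minimal sketch of the "shadow" idea, assuming the simplest orthogonal projections (dropping one coordinate); the course quiz may well have used a different projection. It shows that two different 3D points can cast the same 2D shadow, and that a different projection gives a different shadow of the same point.

```python
import numpy as np

# Two orthogonal projections from 3D to 2D, written as 2x3 matrices.
P_xy = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])  # drop z
P_xz = np.array([[1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])  # drop y

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, -7.0])  # differs from a only in z

shadow_a_xy = P_xy @ a
shadow_b_xy = P_xy @ b
shadow_a_xz = P_xz @ a

print(shadow_a_xy, shadow_b_xy, shadow_a_xz)
```

Because `a` and `b` share a shadow under `P_xy`, the 3D point cannot be recovered from that shadow alone, which is the sense in which going down in dimension is not unique.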

steady basalt Jun 26, 2022, 11:27 AM

#

i wonder if we will ever get a new breakthrough scientist who changes that

wooden sail Jun 26, 2022, 11:27 AM

#

changes what?

steady basalt Jun 26, 2022, 11:27 AM

#

rules of dimensions

wooden sail Jun 26, 2022, 11:27 AM

#

what?

steady basalt Jun 26, 2022, 11:27 AM

#

well consider how far we came in 100 years now imagine 100 years from now

#

they might change maths even more

#

unless we suddenly got less productive

wooden sail Jun 26, 2022, 11:28 AM

#

the changes are made by building on top, the results used are already proven to be true

#

idk what you even mean

steady basalt Jun 26, 2022, 11:28 AM

#

basically cant even imagine what the next einstein will do...

wooden sail Jun 26, 2022, 11:29 AM

#

you should start by looking at the 2x2 matrix in front of you

steady basalt Jun 26, 2022, 11:30 AM

#

im just doing this to be able to do a job, not make a new discovery

#

maybe there will never be a next einstein

wooden sail Jun 26, 2022, 11:31 AM

#

maybe the langlands program yields something cool in a few years time

steady basalt Jun 26, 2022, 11:31 AM

#

do you think our reliance on computers has made that a problem

#

what is langlands

wooden sail Jun 26, 2022, 11:31 AM

#

has made what

#

dude holy crap

#

focus on learning

steady basalt Jun 26, 2022, 11:31 AM

#

well think about it, 60 years ago people had alot more time on their hands to put into thinking

#

https://en.wikipedia.org/wiki/Langlands_program this?

Langlands program

In representation theory and algebraic number theory, the Langlands program is a web of far-reaching and influential conjectures about connections between number theory and geometry. Proposed by Robert Langlands (1967, 1970), it seeks to relate Galois groups in algebraic number theory to automorphic forms and representation theory of algebraic g...

#

never did i think there was a field dedicated to studying sound waves

#

how interesting

#

how can a^2 + b^2 = c^2 work if cubing doesnt work

wheat snow Jun 26, 2022, 12:23 PM

#

@untold bloom I was thinking of a way to plot (in a specific period of time (e.g. 4months)) the average hours watched per month... i was thinking about using .mean() somehow

#

here... with that i get the average of the whole time duration

result=df_vd_R.groupby(df_vd_R["Start Time"].dt.date)["Duration"].sum()
result.index = pd.to_datetime(result.index)
b=(result.loc["2019-03-24": "2019-5-24"].dt.total_seconds()/60/60)

Month= b.mean()
print(Month)

#

but now, i want to take e.g. "2019-03-24": "2019-06-25" and get 4 values, each the average of one month (y axis), with the 4 months on the x axis
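
One way this could be done, as a sketch: sum the hours per day as in the snippet above, then group the daily totals by calendar month and take the mean. The column names `Start Time` and `Duration` come from the question; the toy data is made up, and "average of the month" is assumed to mean the mean of the daily totals on days that have at least one entry.

```python
import pandas as pd

# Toy viewing history (hypothetical values).
df_vd_R = pd.DataFrame({
    "Start Time": pd.to_datetime([
        "2019-03-24 10:00", "2019-03-24 20:00", "2019-03-25 09:00",
        "2019-04-02 12:00", "2019-04-03 12:00",
    ]),
    "Duration": pd.to_timedelta(["1h", "2h", "1h", "4h", "2h"]),
})

# Hours per entry, then total hours per calendar day.
hours = df_vd_R["Duration"].dt.total_seconds() / 3600
daily = hours.groupby(df_vd_R["Start Time"].dt.normalize()).sum()

# Restrict to the period of interest, then average the daily totals per month.
window = daily.loc["2019-03-24":"2019-06-25"]
monthly = window.groupby(window.index.to_period("M")).mean()
print(monthly)
```

`monthly` has one value per month present in the window, so `monthly.plot(kind="bar")` would put the months on the x axis and the average hours on the y axis.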

flat sable Jun 26, 2022, 1:25 PM

#

Ah im searching for sm1 have a training while im still learning python libraries for Ai

steel flax Jun 26, 2022, 2:36 PM

#

I've a question about sklearn.random_projection.johnson_lindenstrauss_min_dim, they are using this formula: n_components >= 4 log(n_samples) / (eps^2 / 2 - eps^3 / 3), but what is the origin of this formula? Wiki page doesn't mention it, the only place where I could find it, is in the sklearn's source code... any ideas?

wooden sail Jun 26, 2022, 3:12 PM

#

if you read here, they show a few references. https://scikit-learn.org/stable/modules/random_projection.html#johnson-lindenstrauss i can't find the exact derivation, though

scikit-learn

6.6. Random Projection

The sklearn.random_projection module implements a simple and computationally efficient way to reduce the dimensionality of the data by trading a controlled amount of accuracy (as additional varianc...

#

i'm under the impression it's a heuristic to generously try to guarantee a value of epsilon in the johnson-lindenstrauss lemma

#

ah, it seems the result is from this paper https://www.sciencedirect.com/science/article/pii/S0022000003000254

Database-friendly random projections: Johnson-Lindenstrauss with bi...

A classic result of Johnson and Lindenstrauss asserts that any set of n points in d-dimensional Euclidean space can be embedded into k-dimensional Euc…

#

that's the one

#

the proof seems rather involved

#

that's a pretty well-cited paper, too

#

might as well do a little @steel flax to make sure you find the message later

mental girder Jun 26, 2022, 3:49 PM

#

I'm trying to make a program to take raw data from a Gaussian text file and export it to excel and was directed towards Jupyter/Anaconda since it has support for pandas which can do that. What are the differences between a Jupyter notebook and a regular Python file?

wooden sail Jun 26, 2022, 3:50 PM

#

none

#

it makes no difference for your application

mental girder Jun 26, 2022, 3:50 PM

#

wooden sail none

Is each block like a separate program?

wooden sail Jun 26, 2022, 3:51 PM

#

yes, you can run each block separately

#

if that's helpful for you, or you want to include text/tex in blocks interleaved with the code, it's nice

#

but otherwise it's no different. you can think of it almost like a fancy IDE

mental girder Jun 26, 2022, 3:52 PM

#

huh

wooden sail Jun 26, 2022, 3:52 PM

#

it doesn't change which packages you can use

mental girder Jun 26, 2022, 3:53 PM

#

It just seems strange to look at something like this after only using regular guis

wooden sail Jun 26, 2022, 3:53 PM

#

you don't have to use it if you don't like it 😛

#

anaconda is nice for package and environment management, but it also isn't necessary

mental girder Jun 26, 2022, 3:55 PM

#

i just downloaded anaconda since my pip wasnt working after installing python

ancient pendant Jun 26, 2022, 3:55 PM

#

Hi I am a beginner and need some help here
I am doing one exercise in which I have (n, m) matrix
and the result I want is (1 , m) and (n , 1).

mental girder Jun 26, 2022, 3:55 PM

#

I just switched computers so im missing all my programming tools

wooden sail Jun 26, 2022, 3:55 PM

#

all right. a good thing to keep in mind then, is that it's better to manage your packages using conda instead of pip

mental girder Jun 26, 2022, 3:55 PM

#

huh. alright, thanks

wooden sail Jun 26, 2022, 3:56 PM

#

like conda install xxxx instead of using pip

wooden sail Jun 26, 2022, 3:56 PM

#

ancient pendant Hi I am a begineer and need some help here I am doing one exercise in which I ha...

can you give more details?

ancient pendant Jun 26, 2022, 3:56 PM

#

Yes wait!

#

#

@wooden sail

wooden sail Jun 26, 2022, 4:05 PM

#

mhm

#

are you familiar with slice notation?

ancient pendant Jun 26, 2022, 4:07 PM

#

yes

wooden sail Jun 26, 2022, 4:08 PM

#

i would use a mix of that and list comprehension

primal shuttle Jun 26, 2022, 4:09 PM

#

@mental girder another alternative to conda is a tool like poetry, which also allows you to handle package dependencies really well

wooden sail Jun 26, 2022, 4:10 PM

#

something like this
```py
In [7]: import numpy as np

In [8]: M = np.array([[1,1,1],[3,2,1]])

In [10]: [M[np.array([i]), :] for i in range(M.shape[0])]
Out[10]: [array([[1, 1, 1]]), array([[3, 2, 1]])]
```

#

that's just one way of doing it

ancient pendant Jun 26, 2022, 4:11 PM

#

wooden sail i would use a mix of that and list comprehension

This is my solution which gives (4, ) matrix but not (4, 1)
so I was thinking using newaxis method
Will it work?

wooden sail Jun 26, 2022, 4:11 PM

#

right, that's pretty much what i was going to suggest as an alternative to what i shared above

ancient pendant Jun 26, 2022, 4:11 PM

#

wooden sail something like this ```py In [7]: import numpy as np In [8]: M = np.array([[1,1...

Oh okay👍

wooden sail Jun 26, 2022, 4:12 PM

#

your method is technically correct, too

ancient pendant Jun 26, 2022, 4:12 PM

#

Okay so I just need to figure out how to use newaxis method
am i in right direction?

wooden sail Jun 26, 2022, 4:12 PM

#

yes, that would work

ancient pendant Jun 26, 2022, 4:13 PM

#

Okay Thanks!

wooden sail Jun 26, 2022, 4:13 PM

#

i would also say that a super clean way to make your code look hot would be to call get_rows on a.T inside of get cols, instead of coding a loop that looks identical to what's in the other function

#

but that's just style
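The style suggestion above (reusing the row function on `a.T`) could look roughly like this. The original `get_rows` / `get_cols` were shared as a screenshot, so the bodies below are a guess at their shape rather than the actual solution.

```python
import numpy as np

def get_rows(a):
    """Return each row of a 2-D array as a (1, m) array."""
    return [a[i][np.newaxis, :] for i in range(a.shape[0])]

def get_cols(a):
    """Return each column as an (n, 1) array by reusing get_rows on a.T."""
    return [row.T for row in get_rows(a.T)]

M = np.array([[1, 1, 1], [3, 2, 1]])
print([r.shape for r in get_rows(M)])  # two arrays of shape (1, 3)
print([c.shape for c in get_cols(M)])  # three arrays of shape (2, 1)
```

The rows of `a.T` are the columns of `a`, so the loop only has to be written once.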

ancient pendant Jun 26, 2022, 4:14 PM

#

Oh yes Thanks🎉

wooden sail Jun 26, 2022, 4:14 PM

#

and regarding np.newaxis:
```py
In [12]: x = np.array([1,2,3,4,5,6,7])

In [13]: x[:, np.newaxis]
Out[13]:
array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7]])

In [14]: x[np.newaxis,:]
Out[14]: array([[1, 2, 3, 4, 5, 6, 7]])
```

ancient pendant Jun 26, 2022, 4:16 PM

#

Yes Thankyou @wooden sail 🙏

primal shuttle Jun 26, 2022, 4:18 PM

#

A way to check it (if you're interested) is to use .flags on the objects to see that the numpy method for transposition is indeed O(1), where you will see that np.transpose keeps the matrices represented as blocks of contiguous memory (as if they were a one-dimensional array) - so the memory doesn't change, it's only the axis that does - hence np.newaxis works as well

wooden sail Jun 26, 2022, 4:20 PM

#

that's real chad advice

primal shuttle Jun 26, 2022, 4:20 PM

#

@wooden sail chad? 😉

wooden sail Jun 26, 2022, 4:22 PM

#

especially cuz it shows you that stuff like newaxis also doesn't make copies, super nice
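A quick way to try the `.flags` check described above. The arrays are arbitrary examples; `np.shares_memory` is used as an extra confirmation that no copy was made.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
t = a.T                      # transpose: a view, not a copy
x = np.array([1, 2, 3, 4])
col = x[:, np.newaxis]       # (4, 1) view of the same buffer

print(a.flags["C_CONTIGUOUS"], a.flags["OWNDATA"])  # True True
print(t.flags["F_CONTIGUOUS"], t.flags["OWNDATA"])  # True False
print(np.shares_memory(a, t), np.shares_memory(x, col))  # True True

# Writing through the view changes the original, since the memory is shared.
t[0, 1] = 99
print(a[1, 0])  # 99
```

Only the shape and strides differ between `a` and `t`, which is why the transpose costs the same regardless of array size.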

steady basalt Jun 26, 2022, 4:31 PM

#

try plot this

primal shuttle Jun 26, 2022, 4:37 PM

#

@steady basalt you have too much time on your hands 😉

steel flax Jun 26, 2022, 4:47 PM

#

wooden sail might as well do a little @steel flax to make sure you find the messag...

thankssss a loooot :>

bronze jacinth Jun 26, 2022, 4:47 PM

#

im trying to follow a tutorial for an opencv project. it requires tensorflow(inexperienced) and it throws this (i have already installed tensorflow). how to fix this?

primal shuttle Jun 26, 2022, 4:48 PM

#

What's the actual error?

bronze jacinth Jun 26, 2022, 4:49 PM

#

primal shuttle Jun 26, 2022, 4:49 PM

#

Oh dear it’s windows

bronze jacinth Jun 26, 2022, 4:49 PM

#

🥲

#

do you recommend doing this in linux? I have to learn it anyways for my course

primal shuttle Jun 26, 2022, 4:50 PM

#

Dear @bronze jacinth - yes, yes, a thousand times yes

bronze jacinth Jun 26, 2022, 4:51 PM

#

yessir on it, will delay this project. thanks!

primal shuttle Jun 26, 2022, 4:51 PM

#

I can tell you that you have wsl available on windows

#

Windows Subsystem for Linux

wooden sail Jun 26, 2022, 4:52 PM

#

right, wsl is a great place to start

bronze jacinth Jun 26, 2022, 4:52 PM

#

yes i tried doing that but eventually installed a vm

primal shuttle Jun 26, 2022, 4:53 PM

#

Cool, if you have a VM just make sure you have a GPU pass through so you can use it for your tensorflow

bronze jacinth Jun 26, 2022, 4:53 PM

#

i will look into that

#

thanks again!

primal shuttle Jun 26, 2022, 4:53 PM

#

Anytime 🙂

#

@bronze jacinth one more question - which hypervisor are you using?

#

I mean Virtualbox, VMware, something else ...

bronze jacinth Jun 26, 2022, 4:54 PM

#

virtualbox

primal shuttle Jun 26, 2022, 4:54 PM

#

Cool - that should make things easier

bronze jacinth Jun 26, 2022, 4:54 PM

#

but my friend who's a little more experienced is trying to get me to dual boot

primal shuttle Jun 26, 2022, 4:55 PM

#

I'm not sure that's really required, unless you have solid reasons to do it this way

#

Usually a simple VM should do

bronze jacinth Jun 26, 2022, 4:55 PM

#

i eventually also plan on learning ROS, and im not sure what all is required for that (software wise)

primal shuttle Jun 26, 2022, 4:56 PM

#

what's ROS?

wooden sail Jun 26, 2022, 4:56 PM

#

i dual boot and i'd still recommend wsl or a vm instead (depending on what the end goal is). at the end of the day, you won't run any hardcore stuff on your own hardware, so all you need is a suitable environment to do more or less realistic tests before deploying them somewhere else

bronze jacinth Jun 26, 2022, 4:56 PM

#

primal shuttle what's ROS?

Robot Operating System

primal shuttle Jun 26, 2022, 4:57 PM

#

Ah - that's out of my scope unfortunately 😦

bronze jacinth Jun 26, 2022, 4:58 PM

#

no problem, youve helped plenty

#

my next doubts will be linux/tensorflow related xD

primal shuttle Jun 26, 2022, 4:58 PM

#

🙂 that's the fun bit - but that's already about 1000 times easier than win

#

But this may be my traumas from the past talking 😉

bronze jacinth Jun 26, 2022, 5:00 PM

#

hmm

primal shuttle Jun 26, 2022, 5:01 PM

#

By means of entertainment, you can also compose your setup through docker

#

But that's for another day 🙂

bronze jacinth Jun 26, 2022, 5:07 PM

#

one step at a time

steady basalt Jun 26, 2022, 5:18 PM

#

bronze jacinth do you recommed doing this in linux? I have to learn it anyways for my course

mac has bash

#

mac is the best os

primal shuttle Jun 26, 2022, 5:20 PM

#

Mac has zsh as standard

#

Be careful with that, they are not 100% equivalent

steady basalt Jun 26, 2022, 5:21 PM

#

works great for me

primal shuttle Jun 26, 2022, 5:25 PM

#

mkdir -p ~/{one, two}

#

See what you get

steady basalt Jun 26, 2022, 5:30 PM

#

No

#

BTW when standardising, if X is exposure and Y is outcome, what is L?

#

intervention?

#

currently studying g methods

#

oh its risk group i think

#

or its just a confounder?

pseudo wren Jun 26, 2022, 5:45 PM

#

is anyone familiar with the library rasa

#

i need help

#

this project is a huge undertaking and the first part is literally just installing rasa

#

which will not work for some reason

primal shuttle Jun 26, 2022, 5:58 PM

#

dontasktoask . com 🙂

pseudo wren Jun 26, 2022, 6:00 PM

#

the question was how did you handle installing rasa because installing it has been giving me trouble :)

primal shuttle Jun 26, 2022, 6:01 PM

#

What's the trouble 🙂

mild dirge Jun 26, 2022, 6:02 PM

#

pseudo wren the question was how did you handle installing rasa because installing it has be...

https://rasa.com/docs/rasa/installation/

Installation

Install Rasa Open Source on premises to enable local and customizable Natural Language Understanding and Dialogue Management.

pseudo wren Jun 26, 2022, 6:03 PM

#

I worked through the installation guide

#

that was the first thing i did

#

the thing is there's so many packages abstracted in the actual thing

mild dirge Jun 26, 2022, 6:04 PM

#

So where did it go wrong lol, wassp isn't just asking it for a laugh

pseudo wren Jun 26, 2022, 6:04 PM

#

that the run time is extremely wrong

#

I attempted to install rasa in full

#

and it did not load for 3 hours

#

this was through colab

#

i then attempted to install it through command line

#

and it errored out due to an issue with the dependencies

#

it cited it being an issue with the package

wooden sail Jun 26, 2022, 6:31 PM

#

what kind of issue?

#

maybe saying you were missing some build tools?

fiery vigil Jun 26, 2022, 6:56 PM

#

Hi, had a question about eigenvalue solvers in numpy: is there a version of np.linalg.eig that will solve complex symmetric matrices? (not complex Hermitian matrices, for which we have np.linalg.eigh)

wooden sail Jun 26, 2022, 7:00 PM

#

a complex symmetric matrix is not a special kind of matrix, as far as i recall

#

i lied, i always forget the autonne-takagi factorization

primal shuttle Jun 26, 2022, 7:04 PM

#

You mean np.linalg.eigvalsh?

#

If takagi factorisation is what you're after, it involves constructing a Hermitian as its step

wooden sail Jun 26, 2022, 7:06 PM

#

In [1]: import numpy as np

In [2]: M = np.array([[1, 1j],[1j, 1]])

In [3]: np.linalg.eigvalsh(M)
Out[3]: array([0., 2.])

In [4]: M
Out[4]:
array([[1.+0.j, 0.+1.j],
       [0.+1.j, 1.+0.j]])

In [5]: np.linalg.eigvals(M)
Out[5]: array([1.+1.j, 1.-1.j])

#

using eigvalsh yields the wrong result, since it's not hermitian, just complex symmetric

primal shuttle Jun 26, 2022, 7:07 PM

#

then I'm not sure, would have to look deeper

wooden sail Jun 26, 2022, 7:08 PM

#

i'm fairly sure most solvers don't have an optimized diagonalizer for these kinds of matrices... or at least i've never seen one

#

but it could be the case that doing takagi by building that intermediate hermitian mat yourself and using eigvalsh is faster than using vanilla eigvals

#

hmmm nah there's a simultaneous diagonalization step

fiery vigil Jun 26, 2022, 7:20 PM

#

wooden sail a complex symmetric matrix is not a special kind of matrix, as far as i recall

The way cupy (or numpy) defines their solver cp.linalg.eigh. or np.linalg.eigh is:

"Return the eigenvalues and eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix."

#

So there is some limitation (and from what I tried, it does give wrong results).

fiery vigil Jun 26, 2022, 7:20 PM

#

wooden sail hmmm nah there's a simultaneous diagonalization step

so is this possible, using this "takagi" process?

wooden sail Jun 26, 2022, 7:21 PM

#

it is, but you can also just remove the h and you're set

#

or maybe... cupy suffers from the same thing as jax, where non hermitian matrices can only be diagonalized on cpu?

fiery vigil Jun 26, 2022, 7:21 PM

#

lol, if only that worked

#

there's no cupy.linalg.eig

wooden sail Jun 26, 2022, 7:22 PM

#

aha

fiery vigil Jun 26, 2022, 7:22 PM

#

wooden sail or maybe... cupy suffers from the same thing as jax, where non hermitian matrice...

yeah, it just throws an error, and then I am left to use numpy.linalg.eig, which is visibly slower

pseudo wren Jun 26, 2022, 7:23 PM

#

wooden sail maybe saying you were missing some build tools?

when you import the library the build tools also import with it. the thing is, the rasa library contains a lot of sub libraries like matplotlib, numpy, and tensorflow. Because of this, the library is huge, and to import what you need, you have to sift through all the sub libraries. it's a little tedious.

fiery vigil Jun 26, 2022, 7:23 PM

#

What is this takagi process? Will it help me use cupy.linalg.eigh but on complex symmetric matrices?

wooden sail Jun 26, 2022, 7:23 PM

#

https://en.wikipedia.org/wiki/Symmetric_matrix#Complex

#

it can help you find the singular values of your matrix by doing eigenvalue decompositions on a few intermediate hermitian matrices

#

but as i mentioned, complex symmetric matrices are not really "special" and are in general not diagonalizable the usual way

#

i'd say to just use the SVD directly instead

fiery vigil Jun 26, 2022, 7:25 PM

#

oh, you mean special like that: I thought you meant, "they are no different than other matrices, so it should be usable"

fiery vigil Jun 26, 2022, 7:25 PM

#

wooden sail i'd say to just use the SVD directly instead

what new hell is that? 😅

wooden sail Jun 26, 2022, 7:25 PM

#

they are no different than other generic matrices, meaning eigvalsh does NOT work on them

fiery vigil Jun 26, 2022, 7:26 PM

#

oh I saw the SVD, forgot about it

slow tapir Jun 26, 2022, 7:26 PM

#

hi y'all, I need some help with a data science task with working with support vector machines. I wasnt sure if I should post that into a help channel or just here because its some kind of a longer task 😅

fiery vigil Jun 26, 2022, 7:27 PM

#

wooden sail i'd say to just use the SVD directly instead

How can I ensure, matrices u and v are inverse of each other?

wooden sail Jun 26, 2022, 7:27 PM

#

they aren't in general, and you can't

#

that's the whole point of what i'm saying

#

there is no guarantee your matrix is diagonalizable because it's not a special matrix

fiery vigil Jun 26, 2022, 7:28 PM

#

It does diagonalize with np.linalg.eig, no problems there

#

just needed a cupy version of that, to distribute the computation over GPU

wooden sail Jun 26, 2022, 7:29 PM

#

what exactly do you need the eigenvalue decomp for? if i may ask

#

cuz i don't think there's a good solution for this

fiery vigil Jun 26, 2022, 7:30 PM

#

These are modes of a system, in which I have to solve the overall problem.

#

I will use these eigenvectors to write a general superposition for any state of the system

#

so have to get both the eigenvalues and eigenvectors right
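A small CPU-only sketch of the superposition idea with `np.linalg.eig`. The matrix `A` and state `x` are made-up examples; it assumes the matrix is diagonalizable, which holds here because the eigenvalues are distinct.

```python
import numpy as np

A = np.array([[2.0, 1j], [1j, 1.0]])   # hypothetical complex symmetric matrix
w, V = np.linalg.eig(A)                # columns of V are the eigenvectors (modes)

# Check the decomposition: A v_k = w_k v_k for every column.
assert np.allclose(A @ V, V * w)

# Expand an arbitrary state x in the eigenvector basis: x = V @ c.
x = np.array([1.0 + 0j, 2.0 - 1j])
c = np.linalg.solve(V, x)
print(np.allclose(V @ c, x))

# For a complex symmetric matrix the modes are orthogonal under the
# unconjugated product v_i.T @ v_j, not under the usual Hermitian inner product.
G = V.T @ V
print(np.round(G, 12))
```

Using `solve` instead of projecting with `V.conj().T` matters here, since the eigenvectors are generally not orthonormal in the Hermitian sense.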

#

can I write decorators in cupy, like those for numba, jit?

wooden sail Jun 26, 2022, 7:35 PM

#

should be doable

fiery vigil Jun 26, 2022, 7:35 PM

#

yeah, I guess wherever I can squeeze out some efficiency. It's quite a letdown that cupy didn't bother with an equivalent of general solver np.linalg.eig

#

thanks for the clarification, it saved me time that would be wasted looking at stuff that wouldn't have worked

wooden sail Jun 26, 2022, 7:37 PM

#

yeah i can't find any clever workaround

fiery vigil Jun 26, 2022, 7:38 PM

#

oh no worries, eigenvalue solvers always have some catch. At least numpy has a general purpose, complex eigenvalue solver!

wooden sail Jun 26, 2022, 7:39 PM

#

if you're willing to try something different, there's a chance the jax eig function does work on gpu, maybe i'm just misremembering

#

gimme a second to test

fiery vigil Jun 26, 2022, 7:41 PM

#

says "symmetric/Hermitian matrices", now the question remains if it is complex symmetric or just real symmetric 😅

wooden sail Jun 26, 2022, 7:43 PM

#

sadly i remembered correctly

#

same issue

fiery vigil Jun 26, 2022, 7:44 PM

#

ah, okay. Well, that's a lot of time saved, again. I was going to dig into this jax and see how to implement it.

#

snail-pace numpy it is 😔

wooden sail Jun 26, 2022, 7:45 PM

#

🐌

#

best of luck

wheat snow Jun 26, 2022, 7:45 PM

#

anybody could help me out rq?

fiery vigil Jun 26, 2022, 7:45 PM

#

wooden sail best of luck

thanks!

wheat snow Jun 26, 2022, 7:46 PM

#

wheat snow @untold bloom I was thinking of a way to plot (in a specific period of t...

here

#

its about netflix watchdata

primal shuttle Jun 26, 2022, 7:47 PM

#

slow tapir hi y'all, I need some help with a data science task with working with support ve...

What’s up with the SVM

slow tapir Jun 26, 2022, 7:52 PM

#

primal shuttle What’s up with the SVM

I just got a data set and I need to perform classification using an SVM, make a training, dev and test set out of it etc. im just kinda lost there

primal shuttle Jun 26, 2022, 7:53 PM

#

And what specifically are you having trouble with? The splitting part or the actual SVM part?

slow tapir Jun 26, 2022, 7:54 PM

#

right now the splitting part, probably after thats done I will have trouble with the SVM part aswell

primal shuttle Jun 26, 2022, 7:55 PM

#

Ok, for SVM I'd suggest using stratified sampling

#

Alternately you could use k-folds

#

I'll leave it at that, see which one fits your data more and then once it's split I'll answer the SVM questions - fair?

slow tapir Jun 26, 2022, 7:56 PM

#

sounds good, I'll try 🙂

primal shuttle Jun 26, 2022, 7:56 PM

#

🙂 hint: it has to do with class imbalance, if your dataset indeed suffers from that 🙂

steady basalt Jun 26, 2022, 8:09 PM

#

slow tapir I just got a data set and I need to perform calssification using an SVM, make a ...

have u tried using the split method from sklearn

#

theres 0 skill required to run data on a sklearn svm so you shud find no problems

slow tapir Jun 26, 2022, 8:15 PM

#

df1 = df[["Height (cm)", "Age", "Sex", "DoesGroceries"]]
df1.sort()
random.seed(230)

split_1 = int(0.6 * len(df1)) # 0.6 of 1.0 is train
split_2 = int(0.8 * len(df1)) # mid between 0.6 and 1.0 is 0.8 for 2x 0.2

train_data = df1[:split_1] # train 0 to 0.6 
dev_data = df1[split_1: split_2] # dev 0.6 to 0.8 = 0.2
test_data = df1[split_2:] # test 0.8 to 1.0 = 0.2

#

thats what I got now for splitting

#

so train should be 60%, dev 20% and test 20% aswell

primal shuttle Jun 26, 2022, 8:18 PM

#

Have you checked that the sizes indeed reflect that?

serene scaffold Jun 26, 2022, 8:20 PM

#

slow tapir ```py df1 = df[["Height (cm)", "Age", "Sex", "DoesGroceries"]] df1.sort() random...

your three statements at the bottom are slicing along the columns. use iloc if you want to slice by row position

slow tapir Jun 26, 2022, 8:21 PM

#

primal shuttle Have you checked that the sizes indeed reflect that?

yes I checked that per hand, should I implement three functions that calculate it?

slow tapir Jun 26, 2022, 8:21 PM

#

serene scaffold your three statements at the bottom are slicing along the columns. use `iloc` if...

what does slicing by row positions mean?

primal shuttle Jun 26, 2022, 8:21 PM

#

I'm gonna help you with that

serene scaffold Jun 26, 2022, 8:22 PM

#

slow tapir what does slicing by row positions mean?

if you just do df[ ], you're picking columns, not rows. looks like wassp can walk you through it.
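For reference, a small check of how the different bracket forms behave. The toy frame is made up. A list of labels inside `df[...]` picks columns, while a plain slice such as `df[1:3]` does select rows by position, so `iloc` mainly makes the intent explicit.

```python
import pandas as pd

df = pd.DataFrame({"a": range(5), "b": range(5, 10)})

cols = df[["a"]]          # list of labels -> column selection, all rows kept
rows_plain = df[1:3]      # plain slice -> rows 1 and 2
rows_iloc = df.iloc[1:3]  # explicit positional row slice, same result

print(cols.shape)                    # (5, 1)
print(rows_plain.equals(rows_iloc))  # True
```

Either way, slicing in order without shuffling first means the split follows whatever order the rows happen to be in.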

slow tapir Jun 26, 2022, 8:23 PM

#

serene scaffold if you just do `df[ ]`, you're picking columns, not rows. looks like wassp can...

hmm but its throwing me an error if I just use one pair of sq brackets

primal shuttle Jun 26, 2022, 8:23 PM

#

that is true

slow tapir Jun 26, 2022, 8:24 PM

#

I should pick those columns from that dataframe. so I think [[]] is correct

primal shuttle Jun 26, 2022, 8:24 PM

#

Yes, you want a list of columns

#

So if you're creating a data frame from a larger set, then these will pick the columns with all their rows

slow tapir Jun 26, 2022, 8:25 PM

#

yes! thats what I wanted/need there:D

primal shuttle Jun 26, 2022, 8:25 PM

#

So it will be a matrix of n rows and 4 cols

#

Yup I'm with you 🙂

#

I'm writing code for you - hold on 🙂

#

Are you familiar with the sklearn package? And train_test_split?

slow tapir Jun 26, 2022, 8:26 PM

#

im just wondering what random.seed(230) does, what is the number 230 for? (just picked it somewhere from the internet lol)

primal shuttle Jun 26, 2022, 8:26 PM

#

Ok

#

A seed is for experiment reproduction

slow tapir Jun 26, 2022, 8:26 PM

#

primal shuttle Are you familiar with the sklearn package? And train_test_split?

yes I heard about that too but I wasnt sure how to use it there

primal shuttle Jun 26, 2022, 8:26 PM

#

I'm gonna show you

#

But first - the seed

slow tapir Jun 26, 2022, 8:26 PM

#

awesome:D

primal shuttle Jun 26, 2022, 8:26 PM

#

the 230 bears no particular meaning in this instance, it can be any number

#

value

#

It serves the reproduction purpose

#

So for example if I want to reproduce your random splits exactly the same way as it has split for you, I need to use the same seed

#

Does that make sense?

slow tapir Jun 26, 2022, 8:27 PM

#

alright, I understand

primal shuttle Jun 26, 2022, 8:28 PM

#

In a more technical sense, it "saves" the state of a random function

#

Anyway

#

I'll post the code for you, and you'll let me know what it does, ok?

slow tapir Jun 26, 2022, 8:29 PM

#

alright, so it makes sense that I keep the random.seed() function, right?

slow tapir Jun 26, 2022, 8:29 PM

#

primal shuttle I'll post the code for you, and you'll let me know what it does, ok?

alright

primal shuttle Jun 26, 2022, 8:29 PM

#

slow tapir alright, so it makes sense that I keep the random.seed() function, right?

Correct - otherwise the RNG (your random function) will split your data differently

#

If you run your code again

slow tapir Jun 26, 2022, 8:29 PM

#

I see, that makes sense

primal shuttle Jun 26, 2022, 8:30 PM

#

And it doesn't matter that it's 230, could be 1, or 12345

#

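from sklearn.model_selection import train_test_split

test_size = 0.2
dev_size = 0.2

# First split: 60% train, 40% held out (dev + test combined)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=test_size + dev_size)

# Second split: divide the held-out 40% evenly into test and dev
X_test, X_dev, y_test, y_dev = train_test_split(X_temp, y_temp, test_size=dev_size / (test_size + dev_size))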

#

(I hope I haven't made a booboo, I haven't tested it)

#

If you want to demo the seed, here is how you can do it

#

import random
random.seed(3)
print(random.randint(1, 1000))
random.seed(3)
print(random.randint(1, 1000))
print(random.randint(1, 1000))

slow tapir Jun 26, 2022, 8:35 PM

#

primal shuttle (I hope I haven't made a booboo, I haven't tested it)

alright, so I need to replace X and y, I would guess X is my current dataframe with the new columns, but what is y?

primal shuttle Jun 26, 2022, 8:37 PM

#

your label

slow tapir Jun 26, 2022, 8:37 PM

#

ahhh

#

okay

#

so my label is in my case isOverweight because the first point of the task was to classify if a person is overweight or not

primal shuttle Jun 26, 2022, 8:38 PM

#

Yup

slow tapir Jun 26, 2022, 8:38 PM

#

and at the start it had true/false values and I converted them to 1/0 values

#

that was correct labeling, right?

primal shuttle Jun 26, 2022, 8:39 PM

#

Cool, yup - python will calculate both 0 and 1 and True and False just the same - False = 0, True = 1

#

If you were to sum them for example

#

As a bonus, train_test_split has a random_state option, which is equivalent to our seed

#

🙂

#

And if your labels / classes are imbalanced, it also conveniently comes with the stratify option for your pleasure 🙂
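A sketch of the two-stage split with both options switched on. The data is synthetic (an 80/20 class imbalance picked for illustration) and `three_way_split` is a hypothetical helper name.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # made-up features
y = np.array([0] * 80 + [1] * 20)    # made-up imbalanced labels

test_size, dev_size = 0.2, 0.2

def three_way_split(X, y, seed):
    # 60% train, 40% held out, keeping the class ratio in both parts.
    X_train, X_temp, y_train, y_temp = train_test_split(
        X, y, test_size=test_size + dev_size, random_state=seed, stratify=y)
    # Split the held-out part evenly into dev and test, again stratified.
    X_dev, X_test, y_dev, y_test = train_test_split(
        X_temp, y_temp, test_size=test_size / (test_size + dev_size),
        random_state=seed, stratify=y_temp)
    return (X_train, y_train), (X_dev, y_dev), (X_test, y_test)

(train, dev, test) = three_way_split(X, y, seed=42)
for name, (Xs, ys) in zip(("train", "dev", "test"), (train, dev, test)):
    print(name, len(Xs), ys.mean())  # size and share of class 1
```

Running it twice with the same `seed` should give identical splits, which is the same reproducibility the separate `random.seed` call was meant to provide.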

slow tapir Jun 26, 2022, 8:43 PM

#

alright, very nice:D

#

not getting any errors

primal shuttle Jun 26, 2022, 8:43 PM

#

Phew - yay!

slow tapir Jun 26, 2022, 8:44 PM

#

so it makes sense to use those two options aswell? or at least random_state ?

primal shuttle Jun 26, 2022, 8:44 PM

#

One of them is enough

slow tapir Jun 26, 2022, 8:44 PM

#

stratify wouldnt make sense I think, I dont think that my labels or classes are imbalanced

primal shuttle Jun 26, 2022, 8:44 PM

#

The stratify option is not a case of "true / false" - it requires a little bit more thinking

#

But if it was, you can handle it therein

#

Also, since you're only using the seed for splitting (I don't think you'd use it anywhere else, in your case), it's better to include it in the train_test_split, rather than separately

#

Makes for cleaner code

slow tapir Jun 26, 2022, 8:46 PM

#

alright

primal shuttle Jun 26, 2022, 8:47 PM

#

And for the pleasure of most people I hang out with, the value for seed is 42 🙂

slow tapir Jun 26, 2022, 8:47 PM

#

I chose 69 :DD

primal shuttle Jun 26, 2022, 8:47 PM

#

😛

#

Cool

#

So your data is split correctly, I assume?

slow tapir Jun 26, 2022, 8:48 PM

#

I do hope, will check real quick

primal shuttle Jun 26, 2022, 8:48 PM

#

Awesome - my point exactly

slow tapir Jun 26, 2022, 8:49 PM

#

im just a bit confused with those X_train, X_temp, y_train, y_temp variables at the beginning

primal shuttle Jun 26, 2022, 8:49 PM

#

What about them?

slow tapir Jun 26, 2022, 8:50 PM

#

im not using the X_temp anywhere

#

oh, its in the second function

primal shuttle Jun 26, 2022, 8:50 PM

#

X_test, X_dev, y_test, y_dev = train_test_split(X_temp, y_temp,                      test_size = dev_size / (test_size + dev_size))

#

You are

#

🙂

#

These are all variables, so you can print them at every step

#

Or view them in whatever way you want

#

If it's easier for you to visualise

slow tapir Jun 26, 2022, 8:52 PM

#

yeah I will do that 🙂

primal shuttle Jun 26, 2022, 8:52 PM

#

🙂

#

It will be better for you to start with a small set as well, to grasp the splits as a whole

#

So take 10 observations and see how it works

misty flint Jun 26, 2022, 8:57 PM

#

https://pennylane.ai/

PennyLane

A Python library for quantum machine learning, automatic differentiation, and optimization of hybrid quantum-classical computations. Use multiple hardware devices, alongside TensorFlow or PyTorch, in a single computation.

#

CLf_HyperThonk

primal shuttle Jun 26, 2022, 8:57 PM

#

@slow tapir any other questions? I'm about to call it a day 🙂

slow tapir Jun 26, 2022, 8:58 PM

#

hmm well so far what youve sent me works:D very nice thanks a lot:D

#

now I need to stratify the data by sex and isOverweight

#

but I think I can just add it myself to the functions

#

xD

primal shuttle Jun 26, 2022, 8:59 PM

#

Ok you want to stratify at the point of splitting, if indeed your classes are imbalanced
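Stratifying on two columns at once can be done by building a combined key and passing it to `stratify`. The frame below is made up; only the `Sex` and `isOverweight` column names come from the discussion, and `strata` is an illustrative name.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 40 people, alternating sex, 10 of them overweight.
df = pd.DataFrame({
    "Sex": ["F", "M"] * 20,
    "isOverweight": [0] * 30 + [1] * 10,
    "Age": range(20, 60),
})

# One label per (Sex, isOverweight) combination, e.g. "F_0" or "M_1".
strata = df["Sex"] + "_" + df["isOverweight"].astype(str)

train_df, rest_df = train_test_split(
    df, test_size=0.4, random_state=0, stratify=strata)

# Each combination keeps roughly the same share in both parts.
print(strata[train_df.index].value_counts().sort_index())
print(strata[rest_df.index].value_counts().sort_index())
```

Every combination needs at least two rows for this to work, so very rare groups may have to be merged or handled separately.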

slow tapir Jun 26, 2022, 9:00 PM

#

and then the next step would be using a linear SVM for classification

primal shuttle Jun 26, 2022, 9:00 PM

#

Is that your task? or your choice?

slow tapir Jun 26, 2022, 9:00 PM

#

but if you need to go, I will figure out somehow:D

#

thats the task there

primal shuttle Jun 26, 2022, 9:00 PM

#

Ah ok

slow tapir Jun 26, 2022, 9:02 PM

#

I think that works with the module StandardScaler and LinearSVC

#

from the lib sklearn

primal shuttle Jun 26, 2022, 9:02 PM

#

Yes

slow tapir Jun 26, 2022, 9:02 PM

#

alright, perfect

primal shuttle Jun 26, 2022, 9:03 PM

#

One more tip: fit the scaler only on the train set, then apply it to the dev and test sets

slow tapir Jun 26, 2022, 9:03 PM

#

okay, is there a specific reason to it?

primal shuttle Jun 26, 2022, 9:03 PM

#

If the scaler is fit on the test set too, it's not "unseen" anymore

slow tapir Jun 26, 2022, 9:03 PM

#

ohhh

primal shuttle Jun 26, 2022, 9:03 PM

#

In simple terms
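In code, the usual pattern is to learn the scaling statistics from the training data only and then apply that same transform to the held-out data. A minimal sketch with made-up numbers, where the features and labels are synthetic:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Synthetic "height, age" features and a label derived from height.
X_train = rng.normal(loc=[170.0, 40.0], scale=[10.0, 12.0], size=(60, 2))
y_train = (X_train[:, 0] > 170).astype(int)
X_test = rng.normal(loc=[170.0, 40.0], scale=[10.0, 12.0], size=(20, 2))
y_test = (X_test[:, 0] > 170).astype(int)

# Mean and std come from the train set only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)   # reuse the train statistics, no refit

clf = LinearSVC().fit(X_train_s, y_train)
print(clf.score(X_test_s, y_test))
```

Wrapping the scaler and the classifier in a single sklearn pipeline would give the same behaviour with less bookkeeping.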

slow tapir Jun 26, 2022, 9:03 PM

#

okay

#

I understand

primal shuttle Jun 26, 2022, 9:04 PM

#

🙂 good luck!

slow tapir Jun 26, 2022, 9:04 PM

#

thank you so much for helping out!:D

primal shuttle Jun 26, 2022, 9:04 PM

#

Pleasure 🙂

pseudo wren Jun 26, 2022, 11:08 PM

#

So I realized that I needed to install a specific version of Rasa to get the library to work in Google Colab. I imported the Rasa library, and then got hit with an error saying that I couldn't use tensor symbols with numpy. I subsequently upgraded my numpy version to 1.19.5 and then restarted the run time so that the changes could be implemented. Each time i've done that, the run time could no longer connect in my colab notebook despite restarting the browser, notebook, and creating new ones.

#

I've also changed the order in which i installed the packages thinking that would change it.

steady basalt Jun 26, 2022, 11:37 PM

#

Does python statsmodels let u do propensity scores and ipw

misty flint Jun 27, 2022, 1:18 AM

#

ive never used rasa in colab before and only locally with my ide

#

pithink

austere swift Jun 27, 2022, 2:36 AM

#

So I'm trying to train this model on two different datasets simultaneously, but the problem I'm having is that the model can't have data from both sets in the same batch (each batch has to either be completely the first dataset or completely the second). What would be the best way to have the dataset shuffled randomly while still maintaining that each batch contains only samples from one of the datasets?

#

would it be better to have it just alternate between datasets or should I just have it randomly select a dataset each iteration

#

well actually neither of those options would be ideal because I wouldn't be able to reproducibly get the same data if i put in the same index

#

actually what I could just do is see if the first value in the batch indices is odd or even, and select the dataset based on that

grave knoll Jun 27, 2022, 5:50 AM

#

Suggestions on good certification course for Big Data?

ancient pendant Jun 27, 2022, 7:25 AM

#

Hello @wooden sail this is my solution.

#

But server which checks answer is saying my dimensions are wrong

#

wooden sail Jun 27, 2022, 7:26 AM

#

huh, it looks like they lied

ancient pendant Jun 27, 2022, 7:26 AM

#

My dimension is (4, ) but not (4,1)

wooden sail Jun 27, 2022, 7:26 AM

#

maybe they don't want the np.newaxis

#

oh i read it backwards

#

or did i? can you try removing the newaxis?

ancient pendant Jun 27, 2022, 7:27 AM

#

Okay wait

#

Yeah now its right!

#

But I didn't understand can you explain

wooden sail Jun 27, 2022, 7:28 AM

#

ok, the task description was wrong, then

#

the thing is that the np ndarray data type does not actually support "true" vectors

#

if you transpose a 1d array, you get the same 1d array back

#

that also means that you can multiply the same vector to the left or right of a matrix

#

it seems to me that whoever wrote the task description was not aware of that or chose to ignore it

#

what i mean to say is that, using 1d arrays, there is no distinction between row and column vectors

#

you should mention this to whoever designed the task

#

the description does not match the tests
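
A small sketch of the point about 1-D arrays, assuming NumPy is installed; the array values and the matrix `M` are made up for illustration and are not from the task being discussed:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0])   # 1-D array, shape (4,)
col = v[:, np.newaxis]               # explicit column vector, shape (4, 1)
M = np.arange(16.0).reshape(4, 4)    # arbitrary 4x4 matrix for the demo

# Transposing a 1-D array gives back a 1-D array of the same shape.
print(v.shape, v.T.shape)      # (4,) (4,)

# A 2-D column really does turn into a row when transposed.
print(col.shape, col.T.shape)  # (4, 1) (1, 4)

# The same 1-D array can sit on either side of the matrix product.
left = v @ M    # treated like a row vector, result has shape (4,)
right = M @ v   # treated like a column vector, result has shape (4,)
```

So a checker that expects shape `(4,)` will reject `(4, 1)` and the other way round, even though both hold the same four numbers.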

ancient pendant Jun 27, 2022, 7:33 AM

#

Man I wasted 1 week thinking about this problem, Thanks🙏
I was doubting myself like I am not made for data analysis,
why my brain is not working and now I found out their description was wrong😅.

ancient pendant Jun 27, 2022, 7:33 AM

#

wooden sail you should mention this to whoever designed the task

Yes I am going to

wooden sail Jun 27, 2022, 7:35 AM

#

that one was 100% not your fault, the tests and the description don't match

ancient pendant Jun 27, 2022, 7:38 AM

#

Man I am feeling so embarrassed right now, I accidentally checked another solution🥲 😂

#

My answer is only 83% right

#

This is the problem

wooden sail Jun 27, 2022, 7:41 AM

#

seems like you had to remove it from both

ancient pendant Jun 27, 2022, 7:43 AM

#

Now I again added np.newaxis to both like in first picture i showed you
Now its right 100%

#

maybe server problem

#

this is staff solution
they did it so cool

austere swift Jun 27, 2022, 8:04 AM

#

austere swift actually what I could just do is see if the first value in the batch indices is ...

I found a better way, I modified the dataset to take a tuple of (dataset_idx, idx) then modified the batch sampler to give the dataloader these indices
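
A rough sketch of that idea in plain Python, not the actual code from this conversation: the function name `single_source_batches`, the dataset sizes and the seed are all hypothetical. Each batch is a list of `(dataset_idx, idx)` pairs that all share one `dataset_idx`, and a fixed seed makes the order reproducible.

```python
import random

def single_source_batches(sizes, batch_size, seed=0):
    """Build shuffled batches of (dataset_idx, idx) pairs.

    sizes[k] is the length of dataset k. Every batch only contains
    indices from a single dataset.
    """
    rng = random.Random(seed)  # local RNG so the result depends only on the seed
    batches = []
    for dataset_idx, size in enumerate(sizes):
        indices = list(range(size))
        rng.shuffle(indices)  # shuffle samples within this dataset
        for start in range(0, size, batch_size):
            chunk = indices[start:start + batch_size]
            batches.append([(dataset_idx, i) for i in chunk])
    rng.shuffle(batches)  # mix the order of batches across datasets
    return batches

batches = single_source_batches(sizes=[5, 3], batch_size=2, seed=1)
```

With PyTorch, a list like this could be passed to `DataLoader` through its `batch_sampler` argument, with the dataset's `__getitem__` unpacking the tuple and reading from the matching underlying dataset. The last batch of each dataset may be shorter than `batch_size`.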

grizzled stump Jun 27, 2022, 8:47 AM

#

Hey guys.
Do you guys feel like Spyder performs better than VS Code, in terms of code execution? Or am I just tripping out?

wooden sail Jun 27, 2022, 8:48 AM

#

there shouldn't be much of a difference

pliant pewter Jun 27, 2022, 9:01 AM

#

I don't think either of them is responsible for actually executing code?

grizzled stump Jun 27, 2022, 9:01 AM

#

pliant pewter I don't think either of them is responsible for actually executing code?

Hmmm, you have a point.

#

I don't know. I just felt like my code completed execution in significantly less time in Spyder than it did in VS Code, for reasons unknown.

wheat snow Jun 27, 2022, 9:47 AM

#

PepeCrySea

#

i formatted an excel file

#

very easy to understand

#

#

depends on the shown information ig

grave moat Jun 27, 2022, 10:16 AM

#

Hey, does anyone know how can I plot on local host using plotly?

steady basalt Jun 27, 2022, 10:41 AM

#

no

#

why show them code at all?

#

show them a presentation or smtn

humble mist Jun 27, 2022, 12:29 PM

#

Hi does anyone have a dockerfile with tensorflow_1? Thank you in advance

upper spindle Jun 27, 2022, 12:51 PM

#

#

#

im trying to read in sample_prices into my jupyter lab

#

but keeps coming out with an error

lapis sequoia Jun 27, 2022, 12:59 PM

#

Click on the bar at the top of the file explorer to get the proper file path, it should be something like C:/Users... @upper spindle

upper spindle Jun 27, 2022, 1:00 PM

#

thanks @lapis sequoia

#

it comes out with this error

lapis sequoia Jun 27, 2022, 1:03 PM

#

Put an r in front like this:

That should fix it for you
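
The screenshot is not in the log, so here is a guess at what the fix looks like. The folder names and the `.csv` extension are hypothetical; only the `sample_prices` name comes from the question.

```python
# Without the r prefix, a literal such as "C:\Users\..." fails to compile,
# because Python reads \U as the start of a unicode escape sequence.

# Option 1: raw string, backslashes are kept as they are.
raw_path = r"C:\Users\me\data\sample_prices.csv"

# Option 2: escape every backslash by doubling it.
escaped_path = "C:\\Users\\me\\data\\sample_prices.csv"

# Option 3: forward slashes, which Windows also accepts in file paths.
slash_path = "C:/Users/me/data/sample_prices.csv"

print(raw_path == escaped_path)  # True
```

Any of the three can then be passed to whatever reads the file.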

upper spindle Jun 27, 2022, 1:03 PM

#

thank you so much @lapis sequoia

lapis sequoia Jun 27, 2022, 1:05 PM

#

No problem 👍

void lion Jun 27, 2022, 1:11 PM

#

which modules are good to learn in relation to AI?

serene scaffold Jun 27, 2022, 1:13 PM

#

void lion which modules are good to learn in relation to AI?

there aren't really libraries where you can learn AI by using them. you have to read about the theory, and then create things using multiple libraries that apply what you've learned.

#

I have an overview of the main libraries in the pins.

primal shuttle Jun 27, 2022, 1:53 PM

#

void lion which modules are good to learn in relation to AI?

Never learn a subject through a tool - always use tools to help you understand a subject

void lion Jun 27, 2022, 1:56 PM

#

i just framed my question wrong

#

i meant to say what modules in relation to AI are good to learn

arctic needle Jun 27, 2022, 1:57 PM

#

Does anyone know if learning about finances is important in data science/data analytics/BI careers? If yes do you recommend any free course?

serene scaffold Jun 27, 2022, 2:22 PM

#

void lion i meant to say what modules in relation to AI are good to learn

I wouldn't even focus on learning libraries. I would try to do something, and use libraries implicitly while trying to achieve that goal.

steady basalt Jun 27, 2022, 2:27 PM

#

arctic needle Does anyone know if learning about finances is important in data science/data an...

i dont think data science can be compared to most BI jobs

#

for Bi and analytics yes, for data science no

mild dirge Jun 27, 2022, 2:28 PM

#

what's Bi?

tacit basin Jun 27, 2022, 2:28 PM

#

arctic needle Does anyone know if learning about finances is important in data science/data an...

In one DS project i was part of for a financial institution, in-depth knowledge of the finance field wasn't required, but it was nice to have.
Free online MS in financial engineering https://www.wqu.edu/programs/mscfe/

MSc in Financial Engineering


steady basalt Jun 27, 2022, 2:40 PM

#

mild dirge what's Bi?

I think its business intelligence

#

like, making bar charts and stuff

mild dirge Jun 27, 2022, 2:41 PM

#

It's like collecting information about a company and converting it into usable data?

steady basalt Jun 27, 2022, 2:41 PM

#

tacit basin In one DS project i was part of for financial institution, the in depth knowledg...

how can a MSc be free? isnt literally everyone and their grandma gona do these free online masters completely fucking the job market?

primal shuttle Jun 27, 2022, 2:54 PM

#

serene scaffold I wouldn't even focus on learning libraries. I would try to do something, and us...

My point exactly

humble mist Jun 27, 2022, 4:05 PM

#

humble mist Hi does anyone have a dockerfile with tensorflow_1? Thank you in advance

Push

steady basalt Jun 27, 2022, 4:23 PM

#

are those values in ur dataframe integers or strings

#

yeah... why?

#

u know what happens when u add two strings in python right? thats why its appending them and not summing them

#

ur welcome, remember to think in terms of how python works and u will find the answer
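
The question being answered here is missing from the log, so this is only a generic illustration of strings versus numbers, with made-up values:

```python
values = ["1", "2", "3"]   # numbers stored as text

# + on strings concatenates instead of adding.
joined = values[0] + values[1]
print(joined)   # 12

# Convert to int first to get arithmetic.
total = sum(int(v) for v in values)
print(total)    # 6
```

In pandas the same conversion is usually done on the whole column, for example with `pd.to_numeric` or `.astype(int)`, before summing.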

misty flint Jun 27, 2022, 4:26 PM

#

hmm it depends on their background. if theyre STEM/technical at all, they might like to see some data but maybe a quick ppt would be better

#

PikaThink

#

do you know streamlit? you could probably make a quick demo that way as well

wheat snow Jun 27, 2022, 4:27 PM

#

PU_PepeRage

#

why is everything so complex

#

PU_MonkaMegaCry

steady basalt Jun 27, 2022, 4:28 PM

#

how do people get helper role btw?

serene scaffold Jun 27, 2022, 4:28 PM

#

wheat snow why is everything so complex

things are complicated before you understand it, and simple after.

steady basalt Jun 27, 2022, 4:28 PM

#

do you have to volunteer and commit to helping people?

serene scaffold Jun 27, 2022, 4:28 PM

#

steady basalt how do people get helper role btw?

see #roles

misty flint Jun 27, 2022, 4:29 PM

#

have you also considered something like using wordclouds? could do some NLP-lite and show most common words associated with each tweet/replies

wheat snow Jun 27, 2022, 4:29 PM

#

serene scaffold things are complicated before you understand it, and simple after.

maybe.... im still dying trying to figure out this

misty flint Jun 27, 2022, 4:29 PM

#

it really depends on what you think they are looking for though

steady basalt Jun 27, 2022, 4:29 PM

#

there are people in this channel that deserve a nobel prize for the amount of effort they put into help

wheat snow Jun 27, 2022, 4:29 PM

#

wheat snow <@836605577400549436> I was thinking of a way to plot (in a specific period of t...

@serene scaffold

steady basalt Jun 27, 2022, 4:29 PM

#

especially the math people

wheat snow Jun 27, 2022, 4:30 PM

#

steady basalt there are people in this channel that deserve a nobel prize for the amount of ef...

@hardy ledge for workin 5 hours straight at a game with me

steady basalt Jun 27, 2022, 4:30 PM

#

blessed channel i dont even go into any other rooms in this server lekl

serene scaffold Jun 27, 2022, 4:30 PM

#

steady basalt there are people in this channel that deserve a nobel prize for the amount of ef...

I keep an eye on people in this channel who I think would be a good candidate, but I can't guarantee how quickly those people will be put to a vote with the other mods/admins or what the outcome will be.

wheat snow Jun 27, 2022, 4:31 PM

#

@serene scaffold gotta admit, i like that siam in your banner

steady basalt Jun 27, 2022, 4:31 PM

#

@serene scaffold maybe most of them wudnt wana be a helper

serene scaffold Jun 27, 2022, 4:31 PM

#

wheat snow @serene scaffold gotta admit, i like that siam in your banner

he's a ragdoll

wheat snow Jun 27, 2022, 4:31 PM

#

serene scaffold he's a ragdoll

oh

#

mbmb

steady basalt Jun 27, 2022, 4:32 PM

#

the rest of the server is very much 'go to the help room to get help' but in this channel, its pure help only if asked

serene scaffold Jun 27, 2022, 4:32 PM

#

steady basalt the rest of the server is very much 'go to the help room to get help' but in thi...

well, this is the channel for data science help

misty flint Jun 27, 2022, 4:32 PM

#

yeah just try to keep the business questions they are trying to answer in mind. that will help guide the analyses you choose to do vs, and more importantly sometimes, the ones you choose not to do. if they're also a business person, i think slides would be good + having an executive summary at the beginning (tl;dr section)

steady basalt Jun 27, 2022, 4:32 PM

#

help?

#

i thought it was just the ds and ai channel

#

oh

serene scaffold Jun 27, 2022, 4:33 PM

#

steady basalt help?

wheat snow Jun 27, 2022, 4:33 PM

#

@serene scaffold if you got some time, you could look over that thing i pinged and wanted to do rq

steady basalt Jun 27, 2022, 4:33 PM

#

damn i never noticed that

serene scaffold Jun 27, 2022, 4:33 PM

#

wheat snow @serene scaffold if you got some time, you could look over that thing i pin...

I'm busy, sorry

misty flint Jun 27, 2022, 4:33 PM

#

serene scaffold

eh sometimes i treat it more as the 'topical chat' part

#

Oopsies

serene scaffold Jun 27, 2022, 4:33 PM

#

more like tropical chat amirite

wheat snow Jun 27, 2022, 4:33 PM

#

serene scaffold I'm busy, sorry

not referring to now... just in general... the week maybe

steady basalt Jun 27, 2022, 4:33 PM

#

might start spending more time in algos and structs channel in a coupla months, scared for interviews

#

i suck so hard at them

#

i literally peak at lc easy

wheat snow Jun 27, 2022, 4:34 PM

#

interviews

misty flint Jun 27, 2022, 4:34 PM

#

good luck dude

#

monkaCHRIST

wheat snow Jun 27, 2022, 4:34 PM

#

nothin i need to worry about rn

steady basalt Jun 27, 2022, 4:34 PM

#

i havnt even learnt how to work with binary tree objects

wheat snow Jun 27, 2022, 4:34 PM

#

PU_PeepoChicken

steady basalt Jun 27, 2022, 4:34 PM

#

the best i got is a 70% passrate array question

misty flint Jun 27, 2022, 4:34 PM

#

what positions are you going to go for

steady basalt Jun 27, 2022, 4:34 PM

#

ds

misty flint Jun 27, 2022, 4:35 PM

#

have you also considered DE

steady basalt Jun 27, 2022, 4:35 PM

#

ive been told even for a da interview i was gona receive arrays and strings questions lmfao

misty flint Jun 27, 2022, 4:35 PM

#

since its supposedly hot rn

steady basalt Jun 27, 2022, 4:35 PM

#

as if that shit wud ever be useful on the job

misty flint Jun 27, 2022, 4:35 PM

#

or something

steady basalt Jun 27, 2022, 4:35 PM

#

nah, not interested in de

misty flint Jun 27, 2022, 4:35 PM

#

PikaThink

#

hmm

steady basalt Jun 27, 2022, 4:35 PM

#

even if it paid 10% more id prefer to do ds

#

de looks boring

misty flint Jun 27, 2022, 4:36 PM

#

what about MLE

#

PikaThink

steady basalt Jun 27, 2022, 4:36 PM

#

yep

#

but id need alot of xp to do that

misty flint Jun 27, 2022, 4:36 PM

#

hmm

steady basalt Jun 27, 2022, 4:36 PM

#

and also i do not personally believe im capable of it

misty flint Jun 27, 2022, 4:36 PM

#

ah

steady basalt Jun 27, 2022, 4:36 PM

#

ive seen real mle code

#

its beyond my level rn

#

plus the math

misty flint Jun 27, 2022, 4:37 PM

#

usually they have a SWE background i think

steady basalt Jun 27, 2022, 4:37 PM

#

yeah

#

i mean in 4-5 years id be down to be a mle

misty flint Jun 27, 2022, 4:37 PM

#

since its usually production level code

steady basalt Jun 27, 2022, 4:37 PM

#

its not like im gona stop learning

misty flint Jun 27, 2022, 4:37 PM

#

eh i think you could do it in 3 or less if you really try

steady basalt Jun 27, 2022, 4:37 PM

#

with a full time job id prob still in my spare time learn coding

misty flint Jun 27, 2022, 4:37 PM

#

since it seems like you know a lot

steady basalt Jun 27, 2022, 4:37 PM

#

nah not rly

misty flint Jun 27, 2022, 4:37 PM

#

steady basalt with a full time job id prob still in my spare time learn coding

you also have to consider you will be learning on the job too

steady basalt Jun 27, 2022, 4:38 PM

#

u prob saw earlier i cudnt even inverse a matrix

misty flint Jun 27, 2022, 4:38 PM

#

so you cant discount that component

steady basalt Jun 27, 2022, 4:38 PM

#

im still on the learning process early on

misty flint Jun 27, 2022, 4:38 PM

#

i dont hang out long term in this channel

#

Oopsies

steady basalt Jun 27, 2022, 4:38 PM

#

ahhh

misty flint Jun 27, 2022, 4:38 PM

#

i just come and go like the wind

steady basalt Jun 27, 2022, 4:38 PM

#

yeah someone was literally teaching me lin alg

#

i was unable to do elimination

misty flint Jun 27, 2022, 4:38 PM

#

anyway good luck on your interviews dude

#

i should get back to work

steady basalt Jun 27, 2022, 4:39 PM

#

cheers bud

#

gona grind the leetcode

#

and the random stats details they ask

#

speaking of stats

#

anyone got a TLDR on why g formula gives u same result as linear regression?

#

but different ci?

#

literally nowhere explains this stuff in a simple way its all papers

wooden sail Jun 27, 2022, 4:42 PM

#

what's "g formula"? what's ci?

steady basalt Jun 27, 2022, 4:42 PM

#

and how do you know if IPW results (different) are better

#

confidence intervals, sorry

#

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6074945/

PubMed Central (PMC)

An introduction to g methods

Robins’ generalized methods (g methods) provide consistent estimates of contrasts (e.g. differences, ratios) of potential outcomes under a less restrictive set of identification conditions than do standard regression methods (e.g. linear, logistic, ...

#

this would speak to you, edd

#

its estimating effect of exposure ? i think youd find it interesting

wooden sail Jun 27, 2022, 4:44 PM

#

nah i'm too tired for this today

#

i also take maths in medical papers with a heap of salt

steady basalt Jun 27, 2022, 4:45 PM

#

how come?

#

i think this is becoming mainstream methods now

wooden sail Jun 27, 2022, 4:45 PM

#

they're usually rediscoveries of old stuff with a new name

steady basalt Jun 27, 2022, 4:45 PM

#

haha

wooden sail Jun 27, 2022, 4:45 PM

#

a good meme one is one in which a person rediscovers the riemann sum

#

classic

steady basalt Jun 27, 2022, 4:46 PM

#

savage..

serene scaffold Jun 27, 2022, 4:46 PM

#

I wanna discover that the ratio of the circumference to the radius is exactly two times pi

#

like what are the chances

steady basalt Jun 27, 2022, 4:50 PM

#

Any torch users in chat?

serene scaffold Jun 27, 2022, 4:50 PM

#

steady basalt Any torch users in chat?

remember what we discussed about asking to ask?

steady basalt Jun 27, 2022, 4:51 PM

#

Do the grad and backward functions alter variables??

#

I was following the tutorial they have that makes you try a backwards pass

#

And I call my earlier variable and it has now changed

sage inlet Jun 27, 2022, 5:03 PM

#

Can anyone help me how to extract all the professions from a text file using nltk or spacy?

steady basalt Jun 27, 2022, 5:04 PM

#

sage inlet Can anyone help me how to extract all the professions from a text file using nlt...

Do u have a list of professions u want or is this the problem

sage inlet Jun 27, 2022, 5:04 PM

#

steady basalt Do u have a list of professions u want or is this the problem

I don't have a list of profession, also if I do have how to train a simple model?

#

in nltk specifically

steady basalt Jun 27, 2022, 5:05 PM

#

Use probabilities of words following other sets of words

#

For example “he worked as a “

#

Would commonly then give you a profession
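
A very rough version of the "he worked as a ..." idea, done with a plain regular expression instead of a trained model. The pattern and the example sentences are made up, and it only grabs the single word after the phrase, so "software engineer" would come out as "software".

```python
import re

# Matches "work as a", "works as an", "worked as a", "working as an", ...
# and captures the single word that follows.
PATTERN = re.compile(r"\b(?:works?|worked|working) as an? ([a-z]+)", re.IGNORECASE)

text = (
    "He worked as a carpenter before the war. "
    "She works as an electrician. "
    "They met in Paris."
)

professions = PATTERN.findall(text)
print(professions)   # ['carpenter', 'electrician']
```

Counting which words show up after such phrases across a large corpus would be one way to build the list of candidate professions.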

sage inlet Jun 27, 2022, 5:06 PM

#

yeah I will try it once

steady basalt Jun 27, 2022, 5:06 PM

#

I’m afraid I don’t have enough nlp experience to walk u thru it

sage inlet Jun 27, 2022, 5:06 PM

#

Also can't we find professions using NER ?

steady basalt Jun 27, 2022, 5:07 PM

#

What’s that?

sage inlet Jun 27, 2022, 5:07 PM

#

Named Entity Recognition

steady basalt Jun 27, 2022, 5:07 PM

#

Isn’t that the same as ctrl+F?

#

Like you’d need the list ?

sage inlet Jun 27, 2022, 5:08 PM

#

No we don't need the list actually, but nltk library has pretrained model that can categorize words as PERSON, ORGANIZATION etc

steady basalt Jun 27, 2022, 5:10 PM

#

Can try looking for words that come before organization or after ?

sage inlet Jun 27, 2022, 5:16 PM

#

yeah we can but I too don't have much experience in NLP to train a model from scratch 😦

steady basalt Jun 27, 2022, 5:19 PM

#

Great time to learn

#

NLP is next on my learn list after I’m done with uni

wooden sail Jun 27, 2022, 5:20 PM

#

it sounds more like they just need a regex or something like that

#

better ask in the python general channel

sage inlet Jun 27, 2022, 5:22 PM

#

I don't think we can use regex for extracting profession

steady basalt Jun 27, 2022, 5:23 PM

#

He wants nlp model

sage inlet Jun 27, 2022, 5:23 PM

#

steady basalt He wants nlp model

Precisely.

steady basalt Jun 27, 2022, 5:23 PM

#

Ur gona need at least some valid professions corpus

sage inlet Jun 27, 2022, 5:23 PM

#

steady basalt Ur gona need at least some valid professions corpus

Yeah Yeah

steady basalt Jun 27, 2022, 5:25 PM

#

I shud tell my friend who loves nlp to just use regex, irrelevant field

#

pydis_strong

wooden sail Jun 27, 2022, 5:27 PM

#

oh well, i'm prepared to be mistaken

steady basalt Jun 27, 2022, 5:29 PM

#

It’s a valid nlp problem

#

Extracting “profession”

bold timber Jun 27, 2022, 5:29 PM

#

hi, anyone can help me? why the amount of rows is so huge? I'm so wondering about that

steady basalt Jun 27, 2022, 5:29 PM

#

U gona need to explore date

#

Take a look how many

sage inlet Jun 27, 2022, 5:30 PM

#

bold timber hi, anyone can help me? why the amount of rows is so huge? I'm so wondering abou...

You can try dropping the NA values

bold timber Jun 27, 2022, 5:30 PM

#

sage inlet You can try dropping the NA values

I don't have missing value

steady basalt Jun 27, 2022, 5:31 PM

#

How long is ur date variable mate

bold timber Jun 27, 2022, 5:32 PM

#

steady basalt How long is ur date variable mate

the unique value of date is 1684

steady basalt Jun 27, 2022, 5:32 PM

#

I bet it’s glitched in like 25000 2017-0101

#

Unique doesn’t matter

#

It’s probably repeated a lot of times

#

Don’t group by but just show the full dataframe

#

Unique values doesn’t mean anything for dataframe length

bold timber Jun 27, 2022, 5:34 PM

#

steady basalt It’s probably repeated a lot of times

of course, the actual dataset is over 3 million rows. in this case, I tried grouping it by date and I'm wondering why I get over 65k rows when there are only 1684 unique dates

#

this is the actual dataframe

#

can u explain to me why when I group that it gets over 65k rows? @steady basalt

#

I'm so wondering how did it happen

steady basalt Jun 27, 2022, 5:39 PM

#

How many unique sales ?

bold timber Jun 27, 2022, 5:40 PM

#

steady basalt How many unique sales ?

over 370k

steady basalt Jun 27, 2022, 5:43 PM

#

You’ve given sales in each date group

#

The date range is only like 5 years?

bold timber Jun 27, 2022, 5:43 PM

#

steady basalt The date range is only like 5 years?

yeah

steady basalt Jun 27, 2022, 5:44 PM

#

That’s weird

bold timber Jun 27, 2022, 5:45 PM

#

steady basalt That’s weird

can you explain to me why?

steady basalt Jun 27, 2022, 5:51 PM

#

What if u don’t use to frame how big is the array

#

Did u get the groupby syntax right

tacit basin Jun 27, 2022, 5:51 PM

#

steady basalt how can a MSc be free? isnt literally everyone and their grandma gona do these f...

you need to have Bachelor’s Degree then this is 2 years free MS course 🙂

steady basalt Jun 27, 2022, 5:52 PM

#

No one doesn’t have a bachelors

#

In the uk like 70% of people have

tacit basin Jun 27, 2022, 5:53 PM

#

also 20 - 25 hours a week

#

i don't have that much time 🙂

steady basalt Jun 27, 2022, 5:53 PM

#

😅

tacit basin Jun 27, 2022, 5:53 PM

#

or rather 😦

steady basalt Jun 27, 2022, 5:54 PM

#

Sadly data science here is for masters or PhD only

#

Job markets here weird af

formal cape Jun 27, 2022, 5:55 PM

#

On the job experience in data science seems more relevant than any PhD or masters

#

To me at least

bold timber Jun 27, 2022, 5:55 PM

#

steady basalt Did u get the groupby syntax right

Like this? It's the same result that I get before

steady basalt Jun 27, 2022, 5:56 PM

#

formal cape On the job experience in data science seems more relevant than any PhD or master...

Yeah but if u don’t have a masters ur not gona be selected for most jobs

#

In the first place

#

To get that experience

formal cape Jun 27, 2022, 5:58 PM

#

Can you remove the mean() argument and see what you get

bold timber Jun 27, 2022, 5:59 PM

#

formal cape Can you remove the mean() argument and see what you get

I get an error

formal cape Jun 27, 2022, 5:59 PM

#

It could be that it tries to do some averaging based on both the date and onpromotion length and it returns something funky

#

keep to_frame(), just remove the mean()
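
The groupby call itself is not visible in the log, so this is only a sketch of the guess above, assuming pandas and a tiny made-up frame that reuses the `date`, `onpromotion` and `sales` column names. Grouping by two columns gives one row per distinct pair, which can be far more than the number of unique dates.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2017-01-01", "2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02"],
    "onpromotion": [0, 0, 5, 0, 2],
    "sales": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# One group per unique date.
by_date = df.groupby("date")["sales"].mean()

# One group per unique (date, onpromotion) combination.
by_both = df.groupby(["date", "onpromotion"])["sales"].mean()

print(len(by_date))   # 2
print(len(by_both))   # 4
```

If that is what happened, 65k rows would just be the number of distinct (date, onpromotion) pairs in the real data.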

bold timber Jun 27, 2022, 6:01 PM

#

formal cape keep to_frame(), just remove the mean()

it's same, I've tried before

formal cape Jun 27, 2022, 6:05 PM

#

What I would try is to create another data frame with only date, onpromotion and sales

#

Then I'd just write newdataframe.groupby([sales].mean for the new dataframe

timid kiln Jun 27, 2022, 6:06 PM

#

Hey folks, don't know if this is the right place for this question. If not, please gently direct me to the correct channel. Thank you.

I have a software program that has what's called a 'schematic'. On this are a series of dots and lines. I can get the coordinates of the dots, but I cannot get any information about the lines other than what dots they're connected to. What I need to be able to determine, using python, is if the line between two dots crosses another line. I was wondering if there was a module or library in python that has this kind of functionality built in?

I know enough about math to create the equation of the line and get it in slope-intercept form and then calculate if the lines intersect but, just wondering if I do that if I'll be reinventing the wheel.

Here's a representative diagram of what I would see in the software:

#

brb

wheat snow Jun 27, 2022, 6:08 PM

#

@bold timber since you are using the same date thingie as in my project, i have a question for that (short: I want the user to input 2 dates e.g. 2019-04-21 and 2019-11-21 and my program should notice that this is like 7-8 months and should generate the average watchtime hours for each month (so 8 values))

formal cape Jun 27, 2022, 6:08 PM

#

Put each point of every line in a set and then use set intersection https://www.w3schools.com/python/ref_set_intersection.asp


#

There's probably other ways but this is what first pops to mind

wheat snow Jun 27, 2022, 6:08 PM

#

wheat snow <@836605577400549436> I was thinking of a way to plot (in a specific period of t...

here is the exact thingie @bold timber

#

it is a netflix watch time analysis btw

bold timber Jun 27, 2022, 6:13 PM

#

formal cape Then I'd just write ```newdataframe.groupby([sales].mean ``` for the new datafra...

it get an error

bold timber Jun 27, 2022, 6:14 PM

#

wheat snow @bold timber since you are using the same date thingie as in my project...

what the question is?

formal cape Jun 27, 2022, 6:14 PM

#

Did you write the entire line what's the error?

#

It shouldn't give an error

bold timber Jun 27, 2022, 6:15 PM

#

formal cape Did you write the entire line what's the error?

like this

formal cape Jun 27, 2022, 6:18 PM

#

Syntax is wrong sorry. It should be df_train.groupby(['sales']).mean()

bold timber Jun 27, 2022, 6:19 PM

#

formal cape Syntax is wrong sorry. It should be ```df_train.groupby(['sales']).mean()```

I'm sorry, that is my fault. This is the result is:

formal cape Jun 27, 2022, 6:20 PM

#

create a new data frame also

#

That only contains onpromotion and date

bold timber Jun 27, 2022, 6:22 PM

#

formal cape That only contains onpromotion and date

With grouping a certain column or not?

formal cape Jun 27, 2022, 6:23 PM

#

I think you can do a newdf = d_train.filter('date', 'onpromotion', 'sales', axis=1)

timid kiln Jun 27, 2022, 6:23 PM

#

formal cape Put each point of every line in a set and then use set intersection https://www....

Thank you!

bold timber Jun 27, 2022, 6:24 PM

#

formal cape I think you can do a ```newdf = d_train.filter('date', 'onpromotion', 'sales', a...

formal cape Jun 27, 2022, 6:25 PM

#

Then do the newdf.groupby(['sales']).mean()

#

This is weird

#

newdf = df_train[['date', 'onpromotion', 'sales'] ].copy()

bold timber Jun 27, 2022, 6:28 PM

#

Well yeah, I think I will get rid of my curiosity for that case now because I still have a lot of things to solve hahaha

but, thank you for the discussion👍

wooden sail Jun 27, 2022, 6:50 PM

#

there must be some library that does this automatically, but doing the math by hand isn't difficult at all. instead of writing the lines in slope-intercept form, you could write them in parametric form. for example, given points a and b, the segment between them is f(t) = a + t(b - a) for t in the interval [0,1]. for another pair of points c and d, we do the same and get g(u) = c + u(d - c). setting f(t) = g(u) gives two linear equations in the two unknowns t and u, which you can solve by inverting a 2x2 matrix, by hand or using numpy or something of the sort (if the matrix is singular, the lines are parallel). then, if both t and u are in the interval [0,1], the two segments intersect, and they do so at the point you get by substituting t into f or u into g

#

@timid kiln
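
The parametric approach described above could look roughly like this in plain Python. The function name and the `eps` tolerance are made up for the sketch, points are `(x, y)` tuples, and overlapping collinear segments are simply reported as not crossing.

```python
def segment_intersection(a, b, c, d, eps=1e-12):
    """Return the point where segment a-b crosses segment c-d, or None."""
    rx, ry = b[0] - a[0], b[1] - a[1]   # direction b - a
    sx, sy = d[0] - c[0], d[1] - c[1]   # direction d - c
    qx, qy = c[0] - a[0], c[1] - a[1]   # offset c - a

    # a + t*(b-a) = c + u*(d-c)  becomes  t*r - u*s = q,
    # a 2x2 linear system in (t, u) with matrix [[rx, -sx], [ry, -sy]].
    det = sx * ry - rx * sy
    if abs(det) < eps:
        return None   # parallel or collinear: no single crossing point

    # Cramer's rule for the 2x2 system.
    t = (sx * qy - qx * sy) / det
    u = (rx * qy - qx * ry) / det

    # The lines cross inside both segments only if t and u are in [0, 1].
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (a[0] + t * rx, a[1] + t * ry)
    return None

print(segment_intersection((0, 0), (2, 2), (0, 2), (2, 0)))   # (1.0, 1.0)
```

Running this over every pair of lines in the schematic would give all the crossings. Segments that only touch at an endpoint count as intersecting here, which may or may not be what is wanted.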

steady basalt Jun 27, 2022, 7:00 PM

#

bold timber Well yeah, I think I will get rid of my curiosity for that case now because I st...

Work with the underlying dictionaries and arrays

#

You will find answer