#linear-algebra
why not both?
ok i worked out some concrete examples in R^n and i think I understand now
When you choose an orthonormal basis of course B = I because each b_ij of B is the inner product of the ith and jth basis vectors
That makes a lot more sense now
Thank you so much for your help!
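A small sketch of that observation, assuming the standard dot product on R^n. The names `dot` and `gramMatrix` and the two example bases are made up for illustration: the matrix of pairwise inner products comes out as the identity for an orthonormal basis and as something else otherwise.

```javascript
// Entry (i, j) of the Gram matrix is the inner product of basis vectors i and j.
// `dot` and `gramMatrix` are illustrative names, not from the conversation.
function dot(u, v) {
  return u.reduce((sum, ui, k) => sum + ui * v[k], 0);
}

function gramMatrix(basis) {
  return basis.map((bi) => basis.map((bj) => dot(bi, bj)));
}

// An orthonormal basis of R^2 (a rotated copy of the standard basis).
const orthonormal = [[0.6, 0.8], [-0.8, 0.6]];
// A basis that is not orthonormal.
const skewed = [[1, 0], [1, 1]];

console.log(gramMatrix(orthonormal)); // approximately the identity, up to rounding
console.log(gramMatrix(skewed));      // [[1, 1], [1, 2]]
```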
Okay so I just finished a course this semester which was multivariable calc and series. I cant lie I got fucked a bit but yeah.
To you guys, how was the difficulty of linear algebra compared to multivariable?
Depends on what type of linear algebra but general (computation based) linear algebra I found much easier than multivar
interesting
I mean i know both courses where I go are hard, and the multi course I took was 50% multi and 50% series which really really sucked cause it was a ton of material in a short time
but idk im hoping linear is a bit easier for me to grasp
If it's more theoretical it may be a little difficult but that is what this server is for
you can always ask for people to help contribute to your intuition
Yeah I suppose. Im bored so im going over some linear material rn
before the next semester
Very good 
we used this resource partially for multi so hopefully its good for linear https://cruzgodar.com/teaching/notes/linear-algebra
Looks pretty standard based on the toc
Seems pretty sparse in terms of explanations (outside of chapter 1), it'll be important that you make sure to do at least some exercises each time they appear
yeah
usually the prof would have reading from this then problems from another textbook
Found this one
looks a bit better
@hardy adder any knowledge of the emory one i sent? or thoughts
Not particularly other than it's more standard
You'll probably find it nicer to work out of
Especially for self-study
I think the explanations are much clearer and the pacing more reasonable
as compared to the first
I much prefer the presentation of vector spaces in this one to the other
Although at the cost of a little generality
Much more concrete (less abstract)
The honor is yours :3
oop
👀
Why did you open a separate linear algebra channel
lol
enough people were asking for it
lobbying always works
u can't ban me

That's what separates good lobbyists from bad ones
get out before 
Colen:

how calculate determinant
cool channel
oh cool
is this advanced maths? o:
so this is calculus 2 right?
calc 1.5
I did group theory before limits 
i did calculus b4 limits
uwu new linear channel
lol
u w u
no channel description ‼
yay now i know 'advanced maff'
Wow I never realized how much I needed this until I got it
see what good happens when woog listens to a wild Colen 
so, reminder to everyone: cayley-hamilton is magic
Yes of course
imo it makes sense, but perhaps I’d throw it into the other category
some linalg is often taken e.g. in high school or by non-math majors who may not care about e.g. group theory
it makes sense to separate it out imo
Is LA really higher level math tho 🤔
So the tricky thing here is that there's two things you can mean when you say linear algebra
You can mean, how do I compute matrix stuff?
Which would be under the low math category
Or you could be doing more theoretical stuff, which probably would technically fall under abstract algebra
I feel like since even theoretical linalg is fairly low-level on the grand scale of things it’d make sense to
a) have it in this channel
b) move this channel to the MATHEMATICS category
I mean you have number theory there too, and stats, both topics that at least where I am are largely university topics
loo
I'm doing up to Linear algebra and differential equations in high school
So that's pretty guud
sure, I don't mind moving it to the other section.
I wasn't too sure myself.
moved @broken hawk @trim ermine @proper crescent @dreamy fiber
Hey baby
ok I just deleted the channel description lol
#MakeLAHigherMaffAgain

We've been accustomed to the LA being in the higher math category for god knows how long now...
exactly its been like it for i'd say at least 3 hours, this change is unacceptable @jagged pendant 
A change like this would wreak havoc upon our already-established socio-economic structure...
One does not simply break tradition
Is this a new channel?
Step 1: Make a matrix mapping our anger at LA being removed from higher-mathematics to the complex plane
sorry I can only make woogian matrices
Nah, it should be a power series
smh
consider: power series with matrix entries
wth

Move linear algebra above Calculus dammit
it… is?
f* and f Are endomorphisms
sure
And its Matrix is the transpose of the f one
f* is the dual of f I assume?
Yeah
$\langle f(x, y \rangle = \langle x, f^*(y) \rangle$
mniip:
I was trying to make sense of this
In the scalar product something was missing
But this i did demonstrate
The 3rd is the point where i am stuck
$\langle f(x), y \rangle = \langle x, f^*(y) \rangle$
Mintman agent 47:
okay, consider $y \in \ker f^*$
mniip:
This: $\ker(f^*) = (\operatorname{Im} f)^\perp$
Mintman agent 47:
for all x, $\langle f(x), y \rangle = \langle x, f^*(y) \rangle = 0$
mniip:
Yeah
that is, $\forall z \in \operatorname{Im} f, \langle z, y \rangle = 0$
mniip:
that is $y \in (\operatorname{Im} f)^\bot$
mniip:
Yup!
this argument works both ways
so we've proved 3.1
for 3.2 you can use 3.1 with a bit of algebraic manipulation
$(\ker f)^\bot = (\ker f^{**})^\bot = ((\operatorname{Im} f^*)^\bot)^\bot = \operatorname{Im} f^*$
Is the demonstration rigourous
mniip:
?
which demonstration
3.1
which part are you doubting?
mniip:
that's pretty normal
I said the argument works both ways
Oooh
the other direction is just the same steps in backwards order
I could place <=> signs at each step to make it rigorous
Neat proof ?
short and elegant
Panda!
Pandou
Pandi
@barren plank i'll see
I am already struggling with 3.2 too
$$\langle AX, Y \rangle = \mathrm{Tr}((AX)^T Y) = \mathrm{Tr}(X^T A^T Y) = \mathrm{Tr}(X^T (A^T Y)) = \langle X, A^T Y \rangle$$
Astréas:
A = Mat_B (f)
Could I have done it with matrices?
Yes
Because I did 1.a with matrices, but for this one I didn't dare
Yes, that's allowed
yuck, why do you have to introduce a basis in this otherwise completely universal theorem
You can always introduce a basis since we are working in a finite dimensional space
yea but that's kinda ugly to be honest
I dont Even know if it's the same name in american
what is?
$$y \in \mathrm{Ker}(f^*) \Leftrightarrow \forall x \in E, \langle x, f^*(y) \rangle = 0 \Leftrightarrow \forall x \in E, \langle f(x), y \rangle = 0 \Leftrightarrow y \in \mathrm{Im}(f)^{\perp}$$
Astréas:
Inner Product Space
either that or euclidean space, yes
3.2 is the same ?
3.2 is slightly different, I haven't looked at how to prove it in terms of points yet
Seems not intuitive imo
it's probably the same
But i am crap so
I learned euclidean space as real inner product space (as opposed to unitary for example, which would be complex inner product)
“them” being what?
me? I'm 20
I think Inner product space is meant to be a general space with an inner product. Euclidean means the dimension is finite
At least that is the difference once translated in french
Panda have you checked 3.2 ?
For 2, it's f = f**
plus taking an orthogonal complement
$$\mathrm{Im}(f^*) = \mathrm{Ker}(f^{**})^{\perp} = \mathrm{Ker}(f)^{\perp}$$
By 3a)
Ah ok!
Astréas:
I think Inner product space is meant to be a general space with an inner product. Euclidean means the dimension is finite
idk about the dimension but I’m pretty sure euclidean implies that it’s a space over ℝ (and not ℂ or 𝔽₂)
Oh yeah sure
But I think finite dimension is also underlined
Function space are not called Euclidean, I think
And for the last one
$$\langle x, f(f^*(x)) \rangle = \langle f^*(x), f^*(x) \rangle = \| f^*(x) \|^2$$
Astréas:
Whut ?
So if x is in Ker(ff*), it is also in Ker(f*), since the inner product is then ⟨x, 0⟩, which is zero
And for the last one, one inclusion is enough; you can then conclude by a dimension argument
Just, how do you justify going from the first to the second equality?
It's the definition of the adjoint
But no, we don't have the adjoint, it's off-syllabus
$$\langle x, f(y) \rangle = \langle f^*(x), y \rangle$$
Astréas:
I think the point of the exercise is to show us the properties of the adjoint, but I can't really take an off-syllabus result for granted, can I?
It's not taken for granted
It's defined just above...
You're allowed to move the f to the other side as long as you add a star to it...
And since adding two stars is the same as removing the star 😄
It's the combination of the formula from 1b) and the third property of 2)
If f is symmetric, is f* then antisymmetric?
No
I mean the matrix
Not that either
Ok, got it
If the matrix is symmetric, its transpose is equal to itself
Lol and 4b is nonsense, hyperplanes aren't on the syllabus either
Oh yeah, I'm dumb...
4b....
Whoa
A hyperplane is a subspace of dimension n-1
Is that the complete definition?
In finite dimension, that's enough as a definition
what’s it you have to show? λ is an eigenvalue, show that E(λ)^⊥ is a subspace?
unless I’m missing something it won’t always be one of dimension n-1, since E(λ) could be of any dimension ≤ n
but if that’s what you have to show, that’s just the same proof as showing that S^⊥ is a subspace for any set S
which would just go:
let v, u be orthogonal to any element in S. pick s∈S arbitrarily. Then ⟨v + αu, s⟩ = ⟨v, s⟩ + α⟨u,s⟩ = 0, so v+αu is also orthogonal to s. but since s is arbitrary, it is orthogonal to all elements in S. Therefore, S^⊥ is a subspace
what is vect(U)
ah okay, yea, so you really only need the property that Span(U) is a one-dimensional subspace then
that it’s an eigenvector etc is irrelevant
have you proven the theorem that if W is a subspace of V then
V = W ⊕ W^⊥?
or, at least that dim(W) + dim(W^⊥) = dim(V)?
really, you need to prove two things:
- span(u)^⊥ is a subspace
- dim(span(u)^⊥) = dim(V) - 1
Yeah this theorem has already been demonstrated in class
okay so you have 2) already then
Done!
so you just need to show span(u)^⊥ is a subspace, which I’ve demonstrated above how to do
Feel delighted man
I dont need to dem that span(U)^⊥ is a subspace no?
Oh yeah an hyperplane needs to be a subspace and have a dim =n-1
yes you do, that’s step one
Done thanks!
If i do the QR decomposition of A=VP. And then the QR decomposition of transpose(P). I will obtain A=VMU. Is M diagonal? Will i obtain the singular value decomposition?
can someone explain to me eigenvalues and eigenvectors?
A fairly long conversation of me explaining the concept. Feel free to come back and ask!
there are some great courses about linear algebra on youtube: 3blue1brown, MathTheBeautiful, and another professor whose name I forgot, but he looks kinda like the flex tape guy
I'd recommend all
I missed our last lecture about the "Festlegungslemma", which states that an isomorphism of vector spaces is sufficiently described by the images of the basis vectors. But neither on Google nor in the books can I find this lemma. Can anyone tell me what it's actually called, or link to an article about it?
linear transformations are described by the images of the basis vectors
This is because of the linearity of the transformations and the fact that any vector in the space can be written as a linear combination of the basis vectors.
yup I know, I would like to read up on it though, do you know the name by any chance?
"Change of basis"? There isn't really a name for this fact.
It seems to be pretty important though, I can't find it anywhere
But thanks for trying to help :)
I‘ve never heard of that and I did linalg in german
it is very important though, yes
wat
pretty easy to see too:
let’s say (v1, v2… vn) is a basis. then there’s a unique way to write any v∈V as a linear combination of the basis elements and so you can write
T(v) = T(Σaᵢvᵢ) = ΣaᵢT(vᵢ) by linearity
qed
there’s not much more to it, maybe do some exercises to convince yourself of its truth
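A minimal sketch of that fact in coordinates, assuming the basis is the standard basis of R^2. The name `applyLinearMap` and the example images are made up for illustration: once the images of the basis vectors are fixed, the image of every other vector is forced by linearity.

```javascript
// T(sum a_i v_i) = sum a_i T(v_i): the images of the basis vectors determine T.
// `imagesOfBasis[i]` is T(v_i); `coords[i]` is the coefficient a_i of v in that basis.
function applyLinearMap(imagesOfBasis, coords) {
  const dim = imagesOfBasis[0].length;
  const result = new Array(dim).fill(0);
  coords.forEach((a, i) => {
    imagesOfBasis[i].forEach((component, k) => {
      result[k] += a * component;
    });
  });
  return result;
}

// Assumed example: T(e1) = (2, 0) and T(e2) = (1, 3).
const images = [[2, 0], [1, 3]];
// v = 4*e1 + 5*e2, so T(v) = 4*(2, 0) + 5*(1, 3).
console.log(applyLinearMap(images, [4, 5])); // [13, 15]
```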
Anyone who can clarify a few basic concepts for me? 😃
Is it understood correctly that we have N observations, each of which has its own M-dimensional vector?
Also, is there a specific reason that vector x is transposed, and all elements of vector X are transposed? 😃
First of all, transposing a column vector basically gets you the same vector, but written as a row vector; and I think this is pretty clear to you, since it's not part of your questions.
Now, I'm pretty sure it's standard notation to consider vectors as column vectors by default. Now look at the way the matrix X is defined: each row of the matrix must be a row vector x_i; but since vectors are all column vectors by default, you need to transpose it first, if you want to fit it in the matrix that way.
When it says that x = [x_1, ..., x_M]^T, your book is basically saying what I said above, that is that all vectors are written in column form by default: in fact, if you transpose the vector on the right side of the equality, what you get is a column vector. Writing [x_1, ..., x_M]^T, or writing [x_1, ..., x_M] vertically, is exactly the same thing.
So it is correctly understood that an entry in X i.e x_1^T, corresponds to a row vector x?
and the reason that x = [x_1...x_M]^T, is because we by default write vectors as column vectors, and thus it is equivalent?
Well... I wouldn't say that an entry of the matrix X is a row vector: remember that a matrix is ultimately an array of numbers, so each of those numbers are the actual entries of X. It's just that, when you look at the rows of this matrix, they happen to be the row vectors x_i that were discussed earlier (because this is how you built the matrix X, after all).
The reason why you find x_1^T, x_2^T, ..., in the definition of X instead of just x_1, x_2, ..., is that, as you correctly say, these would be column vectors by default: and stacking column vectors vertically... well, that would basically give you a longer column vector, and certainly not the N x M matrix you want to define instead. So first you transpose them so they can be written horizontally, then you stack them one on top of another... and bam!, that's how you get the matrix X.
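To make the "transpose, then stack" construction concrete, here is a tiny sketch with made-up numbers (N = 2 observations, M = 3 components each). In an array-of-arrays layout it amounts to storing one observation per row.

```javascript
// Each observation x_i has M components (a column vector by convention).
// Written out as a row it is x_i^T; stacking the N rows gives the N x M matrix X.
// The numbers are made up for illustration.
const observations = [
  [1, 2, 3], // x_1
  [4, 5, 6], // x_2
];

// "Transpose then stack" becomes "one observation per row".
const X = observations.map((x) => [...x]);

const N = X.length;    // number of observations (rows)
const M = X[0].length; // number of components per observation (columns)
console.log(N, M);     // 2 3
console.log(X[1][2]);  // row 2, column 3: the 3rd component of x_2, i.e. 6
```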
Ok, thanks, got it now
yo guys i got that really quick question
if i got a matrix lets say 2x2 matrix to make it simpler, and it has 2 eigenvectors
and i want to find the basis for the eigenspace
do i use both the eigenvectors or just one of them?
depends
each eigenvalue has its own space
so e.g. if they're eigenvectors with different eigenvalues, then each will be a basis vector of its own space
and there are two eigenspaces
but lets say they want me to find A=PDP^-1
but if they have the same eigenvalue, you have to check if they're independent, and if yes, use both
oh, yea, you need to have a separate, linearly independent vector in each column of P
yeah excatly
each corresponding to its value
can i take 1 from the first lamdba and one from the second?
so if you have twice the same value, then you put it there twice in D
and build a basis of those two?
but if they have the same value u cant build a basis in R^n
but you need to make sure that the vectors are independent
it is entirely possible for a matrix to not be diagonalizable btw
yeah i know
but assuming there are n linearly independent eigenvectors
im reading about it right now, i used eigenvalues to try to find a space, but the vectors weren't linearly independent
you put those in P, and then write the corresponding values into D
yeah i know that, but if im trying to find the vectors
am i allowed to take some of them from one of the lamdba
and some from the other lamdba?
you will have to
i always have to ?
lemme make a concrete example
like lets say my characteristic polynomial is (L-4)(L-3), so the eigenvalues are 3 and 4
i take A-3I
and then try to find vectors
and then i go A-4I
and then find one vector?
combine those and thats my P?
the 3x3 matrix A has eigenvalues 1 and 2.
let's say rank(A- 1I) = 1. So you know there are two lin. indep eigenvectors here, and you need both. And rank(A-2I)=2, so you need to find one here
whats rank :O?
and the diagonal matrix will be (1,1,2)
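A sketch of that counting rule: the number of independent eigenvectors for an eigenvalue λ is n minus rank(A - λI). The matrix `A` below is an assumed example matching the numbers in the chat (eigenvalue 1 with two eigenvectors, eigenvalue 2 with one). The helper names and the tolerance `eps` are illustrative choices.

```javascript
// Rank via Gaussian elimination with partial pivoting.
function rank(matrix, eps = 1e-10) {
  const m = matrix.map((row) => [...row]); // work on a copy
  const rows = m.length;
  const cols = m[0].length;
  let r = 0; // number of pivots found so far
  for (let c = 0; c < cols && r < rows; c++) {
    // pick the row with the largest entry in column c as the pivot
    let pivot = r;
    for (let i = r + 1; i < rows; i++) {
      if (Math.abs(m[i][c]) > Math.abs(m[pivot][c])) pivot = i;
    }
    if (Math.abs(m[pivot][c]) < eps) continue; // no pivot in this column
    [m[r], m[pivot]] = [m[pivot], m[r]];
    for (let i = r + 1; i < rows; i++) {
      const factor = m[i][c] / m[r][c];
      for (let j = c; j < cols; j++) m[i][j] -= factor * m[r][j];
    }
    r++;
  }
  return r;
}

// A - lambda * I
function shifted(A, lambda) {
  return A.map((row, i) => row.map((a, j) => (i === j ? a - lambda : a)));
}

// Assumed 3x3 example with eigenvalues 1 (twice) and 2.
const A = [[1, 0, 1], [0, 1, 1], [0, 0, 2]];
const n = A.length;
console.log(n - rank(shifted(A, 1))); // 2 independent eigenvectors for eigenvalue 1
console.log(n - rank(shifted(A, 2))); // 1 independent eigenvector for eigenvalue 2
```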
im stil lhella new to this m8
...how are you looking for eigenvectors and don't know what the rank of a matrix is?
well im studying in swedish
thats's an incredibly weird progression
rank of a matrix = how many independent columns it has
yeah im missing alot of words
but yea you need to find as many vectors as you're missing to get full rank. e.g. if you have a 4x4 matrix and rank(A - λI) = 2 then you have to find two vectors
then there was only one eigenvalue to begin with
i see
ive been doign 2x2 matrix
so that explains why ive only been needing to use one eigenvalue
yea there the options are:
-two values, one vector each
-one value with two vectors
-one value with only one vector -> not diagonalizable
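One matrix for each of those three cases, as an illustration (the matrices are picked here and do not come from the chat):

$$
\left[
\begin{array}{cc}
1 & 0 \\
0 & 2
\end{array}
\right], \qquad
\left[
\begin{array}{cc}
3 & 0 \\
0 & 3
\end{array}
\right], \qquad
\left[
\begin{array}{cc}
3 & 1 \\
0 & 3
\end{array}
\right]
$$

The first has eigenvalues 1 and 2 with one eigenvector each. The second has the single eigenvalue 3, and every nonzero vector is an eigenvector, so two independent ones exist. The third also has only the eigenvalue 3, but $A - 3I$ has rank 1, so there is only one independent eigenvector, $(1, 0)^T$, and the matrix is not diagonalizable.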
np
offtopic, i havnt played any KH game
but this song is fire
off-topic is best topic tbh
Anyone familiar with PCA? 😃
@mellow hull I know a bit about it
Same
hey guys
how can i compute 3D unit vector using two angles ?
horizontal and vertical
can someone give me the formula ?
In this section we will define the spherical coordinate system, yet another alternate coordinate system for the three dimensional coordinate system. This coordinates system is very useful for dealing with spherical objects. We will derive formulas to convert between cylindr...
That's the common "math" way to do it, but it might not be what you're looking for if you're a programmer. Let me know.
i am programmer
what is the rigorous definition of rank?
The dimension of the subspace the column vectors span
and what is span
All linear combinations of a set of vectors
The span of a set of vectors is itself a set
My book defines rank as the integer r that satisfies the conditions 1) The matrix A has a minor of order r which does not vanish. and 2) Every minor of the matrix A of order r+1 and higher (if such exists) vanishes
Matrix A is just a general matrix of size nxm
I mean that's not wrong, but it's also not very enlightening
rank of a matrix = number of linearly independent columns = number of linearly independent rows = dimension of the span of the column vectors = dimension of the image of L_A, where L_A is the linear transformation given by left multiplication with A
each of these is a sensible and rigorous definition of rank
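A quick check that these definitions agree, on one small matrix chosen for illustration:

$$
A = \left[
\begin{array}{cc}
1 & 2 \\
2 & 4
\end{array}
\right]
$$

Minors: the $1 \times 1$ minor $1$ does not vanish, while the only $2 \times 2$ minor is $\det A = 1 \cdot 4 - 2 \cdot 2 = 0$, so $r = 1$ by the book's definition. Columns: the second column is twice the first, so there is one independent column, and likewise one independent row. Image: $A$ sends every vector to a multiple of $(1, 2)^T$, so the image is a line, of dimension 1.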
@broken hawk can also relate it to the number of features under PCA
for the stats boyos
I have no idea what that acronym stands for
Principal Component Analysis
https://en.m.wikipedia.org/wiki/Principal_component_analysis
meh all it does is find the eigenvectors of the covariance matrix
okay, I also have no idea what that concept is ^^ is it in any way related to singular value decomposition?
https://en.m.wikipedia.org/wiki/Singular_value_decomposition
yea I know what the svd is
oh yea it says here “the principal components transformation can also be associated with the svd”
ya its all related
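A short sketch of how the two are related, assuming the data matrix $X$ has one observation per row ($n$ rows), each column has been centered to mean zero, and $X = U S V^T$ is a thin SVD. The letter $S$ for the diagonal matrix of singular values is chosen here so it does not get confused with a covariance matrix.

$$
X = U S V^T \quad \Rightarrow \quad C = \frac{1}{n-1} X^T X = V \, \frac{S^2}{n-1} \, V^T
$$

So the columns of $V$ are eigenvectors of the sample covariance matrix $C$, which are the principal directions, and the corresponding eigenvalues are $s_i^2 / (n-1)$.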
its when u dont have a square matrix or have sparsity that there become distinctions
svd works for nonsquare though?
they are basically the same thing cept U can be unnecessarily large in svd because of the latter
u can drop the unnecessary columns tho i think
hey guys , so i am a programmer and i use
this equation
this.direction = [
Math.sin(vAngle) * Math.cos(hAngle),
Math.sin(vAngle) * Math.sin(hAngle),
Math.cos(vAngle)
];
but no matter what is my input
it outputs values that belongs to the positives x , y ,z 3D space
shouldnt it output a values in negtive if i input values larger than 90 ?
Cool, glad to see that equation worked for you! Your programming language likely works in radians, not degrees.
Check to see if there's a degree version of sin and cos, or multiply vAngle and hAngle by 2π/360 before using them
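A minimal sketch of that advice, assuming the same angle convention as the snippet above (vertical angle measured from the +z axis, horizontal angle measured around z from the +x axis). The function name `directionFromDegrees` is made up for illustration. With the conversion done by assignment, angles past 90 degrees do produce negative components.

```javascript
// Unit direction vector from two angles given in degrees.
function directionFromDegrees(hAngleDeg, vAngleDeg) {
  const toRad = Math.PI / 180; // the same factor as 2π/360
  const h = hAngleDeg * toRad;
  const v = vAngleDeg * toRad;
  return [
    Math.sin(v) * Math.cos(h),
    Math.sin(v) * Math.sin(h),
    Math.cos(v),
  ];
}

const d = directionFromDegrees(200, 120);
console.log(d);                // all three components are negative for these angles
console.log(Math.hypot(...d)); // length is 1 up to rounding
```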
i convert them in radins in the lines before this
vAngle = glMatrix.toRadian(vAngle);
hAngle = glMatrix.toRadian(hAngle);
tbh i think the equation is fine and the code is also fine
but i think i dont know how the view matrix works
which might be what is causeing the problem but i see it normal
The view matrix?
Camera matrix, it's a 4x4 matrix that is used to transform the whole 3D space from world space to camera space
its like moving the world around the camera not the other way
Use quaternions
i tried to but keeping track of this extra vector in my code might cause me some troubles later
idk
Use a matrix.
quaternions are dank af
dont know much about them but ive heard they are computationally tedious
Easier than matrices, but can do less than matrices. Matrices usually win out
how was determinant discovered? especially for generalized nxn
or it is just randomly thought up rules that seems to work
Note that when you solve a system of linear equations by hand, the denominator in the solution is the determinant of the coefficient matrix (this is Cramer's rule). I imagine they played with that for a while, and it evolved from there
I mean it's easy to see for 2x2, but how did they extend that to nxn
The determinant of an nxn matrix is the alternating sum, over the elements of a chosen column, of each element times the determinant of the (n-1)x(n-1) matrix formed by deleting that element's row and column
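That recursive rule can be sketched directly, expanding along the first column. This is only meant to illustrate the definition: the cost grows like n!, so it is not how determinants are computed for large matrices. The function name `det` and the example matrices are chosen here.

```javascript
// Determinant by cofactor (Laplace) expansion along the first column.
function det(m) {
  const n = m.length;
  if (n === 1) return m[0][0];
  let total = 0;
  for (let i = 0; i < n; i++) {
    // minor: delete row i and column 0
    const minor = m.filter((_, r) => r !== i).map((row) => row.slice(1));
    const sign = i % 2 === 0 ? 1 : -1; // alternating signs down the column
    total += sign * m[i][0] * det(minor);
  }
  return total;
}

console.log(det([[1, 2], [3, 4]]));                  // -2
console.log(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]])); // 18
```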
hello guys, All square matrices without a zero row are invertible, this 'zero row' is referring to a row consisting of entirely zeros right?
yeah
All square matrices without a zero row are invertible? That's incredibly false.
yeah, I am supposed to evaluate if its true or false
$$\left[
\begin{array}{cc}
1 & 1 \\
1 & 1
\end{array}
\right]$$
TendentiousTorturousTopics:
What about this? I know that statement 3 is definitely wrong, not sure about 1 and 2
A^T + B^T = (A + B)^T, and the inverse of a transposed matrix is the transposed of the inverse
So 2 is definitely true
I'm wondering if there's a counterexample to 1
Oh yeah here's an easy one
ohh yeahh, (A^T)^-1 = (A^-1)^T
Take the 2x2 identity matrix and split it into two nonzero (lower triangular) matrices A and B that each contain only a single 1, the other entries being 0
Then both A and B have determinant 0, thus not invertible
1 false, 2 true, 3 false
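Written out, the split of the identity described above looks like this (the statements being tested are not shown in the log, so this only spells out the example itself):

$$
A = \left[
\begin{array}{cc}
1 & 0 \\
0 & 0
\end{array}
\right], \qquad
B = \left[
\begin{array}{cc}
0 & 0 \\
0 & 1
\end{array}
\right], \qquad
A + B = \left[
\begin{array}{cc}
1 & 0 \\
0 & 1
\end{array}
\right]
$$

Here $\det A = \det B = 0$, so neither matrix is invertible, while $A + B = I$ is.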
thank you! @empty copper
Np 🍮
I'm taking this course https://courses.edx.org/courses/course-v1:UTAustinX+UT.5.05x+1T2019/course/
@wintry steppe You here?
that doesn’t look like linalg to me
It's not
I mean it is but it’s not what I’d expect to be taught in linalg
but rather in, like, well, (the high school version of) algebra
it annoys me that there’s two entirely distinct subjects typically called algebra
I barely even see how solving equations for x and group theory is related at all
like sure it’s all connected by stuff like abel-ruffini theorem etc
but high school algebra should prolly be called sth like “intro to equations” or “intro to variables”
um excuse me?!!??
x + 2 = 7 is OBVIOUSLY as complicated as inversing transposed matrices
You may think
x = 5 because 7-2=5
but how do you know
the funny thing is that’s not even a linear equation
you don't know what you even know
oof
Hello, could someone show me an example using numbers of finding the L2 norm? Meaning if we had two points, say (3,2) and (4,7), how would we calculate the L2 norm of these points? I get how to calculate the L2 norm of just the x, or just the y, but how about both? Thx
TendentiousTorturousTopics:
In this case, you have a norm on elements of the vector space $\mathbb R^2$, so you have a metric, i.e. Euclidean distance, taking in two points.
TendentiousTorturousTopics:
but your question isn't mathematically well-formed, since you're saying something like calculating the L2 norm of two points.
Hello sorry I am not discribing it properly
describing
So a bit more background
Lets say we have a d-dimensional multivariate Gaussian distribution (X1....Xd)
with a mean vector u in R^d and a covariance matrix in R^(d x d)
where u subscript i denotes the ith element of u, and summation ij to denote the element at i'th row and j'th column of summation. (summation is the variance here I guess?)
is it making sense so far? @wintry steppe
wait what does that have to do with calculating a norm?
Ok so here is that part lol
Let $x, y \in \mathbb R^d$ be two independent samples drawn from $N(\mu, \Sigma)$. Give expressions for $\mathbb E||x||_2^2$ and $\mathbb E||x - y||_2^2$. Express your answer as a function of $\mu$ and $\Sigma$. $||x||_2$ represents the $\ell_2$-norm of vector $x$.
not of summation; $\Sigma$ means a covariance matrix in this case.
the question marks are Ellxll2
TendentiousTorturousTopics:
Yeah sorry i am having ahard time describing it because i dont understand the question well
I know the basic l2 norm vector but i think the question is a bit more complicated, not sure how to approach it
i guess the covariance only comes into play when we are looking at x and y
Ellxll2 can probably be calculated without that right
it's asking for $\mathbb E[||x||^2_2]$ and $\mathbb E[||x-y||^2_2]$
yes
sorry i cant post the signs properly 😦
That is correct though
well
actaully
It should be squared as well
squared of each of them
ah I was a bit confuddled by it not being squared
Ellxll2^2
that would be something hard to calculate
;P
TendentiousTorturousTopics:
So lets start with the first one, just the x one
okay
Yep!
so you recognize that the squared l2 norm is just the sum of squares of the components, right?
Yes
expectation is linear
i am getting confused with the x,y and dimensions etc
so it's just the sum of expectations of the squares
Is this just the norm of two points?
or a vector with 2 values
I am just having trouble understanding what exactly the norm is, i know its a magnitude
the squared L2 norm of a vector is defined as $||x||^2_2 = \sum\limits_{i=1}^n x_i^2$
TendentiousTorturousTopics:
I know if we had (0,1,3) it would just be sqrt(1+9)
Ok great, so in that picture you posted, x itself is just a vector
TendentiousTorturousTopics:
well there are two parts, the first one is just x, the second is x-y
yes
so what does the y represent? simply the y coordinate of the two points?
ah hmm..
and then you get a quantity z=x-y from them
and it's asking what the expectation of the squared l2 norm of z is
Let's say you draw two samples (0,0,0) and (1,1,1) from the distribution
ok so this is a 3 dimensional distrib
then the value of ||x-y|| = sqrt 3
is x your first point and y your second?
I said that we sampled x and y from the distribution
ok, let me ask a very dumb question here haha
It doesnt tell me how many dimensions this is
simply d-dimensional
so here x,y doesnt mean x coord, y coord, it just means two diff samples
so x could be (3,6,1) or (1), it doesnt say and doesnt matter, correct?
no
so x and y represent two different vectors on a plane
and the dimensionality is unknown
the dimensionality is determined by the dimensionality of the covariance matrix
yeah it doesn't matter in the end
ok great, i was confusing it because i kept thinking (x,y) were points
when in reality they are vectors
so further than that, now if we speak about the covariance matrix, which is just how x varies with respect to y right?
well..
Actually no forget that
Covariance would be when one dimension changes, how it affects the others?
covariance matrix is defined as $\Sigma_{ij} = \text{Cov}(x_i, x_j)$
TendentiousTorturousTopics:
where $x_i$ and $x_j$ are the ith and jth components of the vector, respectively
TendentiousTorturousTopics:
if its a vector, how does it have row and column?
shouldnt it just have one row and many column
or one column and many rows
sorry i know these are silly questions
but im just trying to understand the problem and not very familiar with LA
a vector just can be indexed
I know in statistics a covariance would be how one variable changes with respect to another
A vector-valued random variable is just a measurable function from the sample space to some space R^d
which really means that a vector-valued random variable with dimension 1 is the same as a scalar random variable
where d represents dimensionality, and R is just the space itself?
yes
that makes sense
In this problem, we dont know the dimensionality as we said before, so these vectors are drawn in a multi-dimensional field. But if they are vectors, they only change on one plane?
yes d is the dimensionality
it turns out that in general, things will depend on the dimensionality
but if you solve this problem, you'll find a nice expression at the end that only sorta depends on the dimensionality
in that you don't really need to write the dimensionality in there
Yes youre correct, im just trying to envision this problem and what it looks like 😃 But ok just for the sake of an example im going to assume its a three dimensional space
So in this three dimensional space, we have two vectors, x and y
correct so far?
yes
awesome, so we want to first calculate the l2 norm of x, squared
so if the x vector had the coordinates (3,6,-1) the l2 norm would be sqrt(3^2 + 6^2 + (-1)^2)
then if we square that, it would get the l2 norm squared?
id try to use the LaTeX but it would take me a while and i dont want to make you wait
yes
so the squared l2 norm of a vector is just the sum of squares of its coordinates
yes
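The two numeric examples from the conversation can be checked with a few lines. The helper names `squaredNorm`, `norm`, and `subtract` are chosen here for illustration.

```javascript
// Squared l2 norm: the sum of squares of the coordinates (no square root).
function squaredNorm(x) {
  return x.reduce((sum, xi) => sum + xi * xi, 0);
}

// l2 norm: the square root of the squared norm.
function norm(x) {
  return Math.sqrt(squaredNorm(x));
}

// Componentwise difference of two vectors of the same length.
function subtract(x, y) {
  return x.map((xi, i) => xi - y[i]);
}

console.log(squaredNorm([3, 6, -1]));              // 9 + 36 + 1 = 46
console.log(norm(subtract([0, 0, 0], [1, 1, 1]))); // sqrt(3)
```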
However, this problem wants me to express it as a function of the mean and variance. So in this case, the mean
what does that mean? no pun
How do we take the mean of a vector
Or how is that possible I guess
It wants you to express $\mathbb E[||x||_2^2]$ as a function of $\mu$ and $\Sigma$
TendentiousTorturousTopics:
of course you can take the mean of several vectors
yes, of several
just like how you can talk about expectation of a vector-valued random variable
x is a sample of a random variable
we can certainly talk about $\mathbb E[X]$ if $X \sim N(\mu, \Sigma)$
TendentiousTorturousTopics:
right?
so we have a normal distribution of this random variable
so what is this X vector made up of? we just took a sample three timesfrom this RV?
and the Y vector is another sample of three from this RV?
hm
or maybe X is a random sample of one RV
pretty much
no x and y are samples of the same R.V.
no
the distribution is of vector-valued r.v.s
so we can assign a density $f(\mathbf x)$, where $\mathbf x$ is a vector
TendentiousTorturousTopics:
when we take a sample, we're sampling from a vector-valued distribution
I suppose you can think of it as sampling d random variables $x_1, x_2, x_3, ..., x_d$, where each is distributed normally, but then they're not independent; they have covariance with one another
TendentiousTorturousTopics:
where x1, x2, x3 are different RVs but do affect eachother
i just keep getting so caught up on where i think x1, x2, x3 would be the i'th pull from the same variable
i kind of get the stat part, but the LA part with the matrices just throws me for a loop
so X is made up for three normally distributed random variables that are not independent of eachother
and Y is made up of the same three normally distributed RVs that are not independent of eachother
but with diff values
each RV represents one dimension
yes
awesome I got it now
ALRIGHT
So
this first Vector X
we will be computing the L2 norm squared
so its simply X1^2 + X2^2 + X3^2
and for the L2 norm of X - Y squared it would be (X1-Y1)^2 + (X2-Y2)^2+(X3-Y3)^2
yes
great, it makes sense 😃 So then how do I express them as a function of the mean and covariance
mu and summation
er
sigma
so $\mathbb E[||X||_2^2] = \mathbb E[X_1^2 + X_2^2 + X_3^2] = \mathbb E[X_1^2] + \mathbb E[X_2^2] + \mathbb E[X_3^2]$
TendentiousTorturousTopics:
yep
We have $\mathbb E[X_1^2] = \text{Var}(X_1) + \mu_1^2$
TendentiousTorturousTopics:
Ahh, yes
so $\mathbb E[X_k^2] = \Sigma_{kk} + \mu_k^2$
TendentiousTorturousTopics:
the expectation of the variable squared is the variance of the variable plus the square of the expectation of the variable
and therefore $\mathbb E[||X||_2^2] = \sum\limits_{k=1}^d \mathbb E[X_k^2] = \sum\limits_{k=1}^d \Sigma_{kk} + \sum\limits_{k=1}^d \mu_k^2$
TendentiousTorturousTopics:
youre so good at LaTeX
im going to try to write my hw in LaTeX you inspired me haha
$\sum\limits_{k=1}^d \Sigma_{kk} + \sum\limits_{k=1}^d \mu_k^2 = \text{tr}(\Sigma) + ||\mu||_2^2$
TendentiousTorturousTopics:
and you're done
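A quick numerical sanity check of the identity $\mathbb E[||X||_2^2] = \text{tr}(\Sigma) + ||\mu||_2^2$, as a rough sketch only. The values of `mu` and the factor `L` are made up for illustration (they are not from the homework), and samples are drawn as $X = \mu + Lz$ with $z$ standard normal, so that $\Sigma = LL^T$.

```python
import random

def sample_gaussian(mu, L, rng):
    """Draw one sample X = mu + L z, where z has independent N(0, 1) entries."""
    d = len(mu)
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return [mu[r] + sum(L[r][c] * z[c] for c in range(d)) for r in range(d)]

def covariance_from_factor(L):
    """Sigma = L L^T is the covariance matrix of X = mu + L z."""
    d = len(L)
    return [[sum(L[r][k] * L[c][k] for k in range(d)) for c in range(d)]
            for r in range(d)]

def expected_sq_norm(mu, Sigma):
    """Closed form: E[||X||^2] = tr(Sigma) + ||mu||^2."""
    trace = sum(Sigma[k][k] for k in range(len(mu)))
    return trace + sum(m * m for m in mu)

# Hypothetical example values, chosen only for illustration.
mu = [1.0, -2.0, 0.5]
L = [[1.0, 0.0, 0.0],
     [0.5, 1.0, 0.0],
     [-0.3, 0.2, 0.8]]
Sigma = covariance_from_factor(L)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
mc_estimate = sum(
    sum(x * x for x in sample_gaussian(mu, L, rng)) for _ in range(n)
) / n
closed_form = expected_sq_norm(mu, Sigma)

# The two numbers should agree up to Monte Carlo noise.
print(closed_form, mc_estimate)
```

The off-diagonal entries of `L` make the three coordinates correlated, which is the point: the covariances do not show up in the answer, only the diagonal of $\Sigma$ does.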
$\Sigma$ is a dxd matrix
For example, if you want the standard Gaussian distribution, we set $\mu = 0$ and $$
\Sigma = \left[
\begin{array}{ccc}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{array}
\right]
$$
and the trace is the sum of the elements along the diagonal
$\Sigma_{ij} = \text{Cov}(x_i, x_j)$, so if $i=j$, then $\Sigma_{ij} = \text{Cov}(x_i, x_j) = \text{Cov}(x_i, x_i) = \text{Var}(x_i)$
and that last statement is for our first question, right
where we just use X
err wait..
no.
so i and j represent the rows and columns right
of our matrix
or maybe not
yes
we had x1 x2 and x3
i is the ith row, and j is the jth column
no
the matrix is always 2 dimensional
$$\Sigma = \left[
\begin{array}{cc}
\text{Cov}(X_1, X_1) & \text{Cov}(X_1, X_2) \\
\text{Cov}(X_2, X_1) & \text{Cov}(X_2, X_2)
\end{array}
\right]
$$
that's for 2-dimensional vectors
so it just becomes sigma = Cov(X1,X2) * Cov (X2,X1)?
maybe its just me, but it feels so much harder when they dont specify the dimensionality
i know it doesnt matter
but if we have a ton of dimension, wouldnt there be a lot more variables
and more Covariances?
yes
the number of covariances is d^2
Sigma is a matrix so that's what sigma is equal to
nah
Great thank you so much, this part doesnt look as difficult and ill attempt to answer it first
Find the distribution of $Z = \alpha_i X_i + \alpha_j X_j$, for $i \neq j$ and $1 \le i, j \le d$. The answer will belong
to a familiar class of distribution. Report the answer by identifying this class of distribution
and specifying the parameters.
alpha represents this parameter im assuming
so each dimension has a parameter
drawing a bit of a blank, I know for a normal distribution it should be something like aX + b right? not sure if that applies here
I know all of our RVs are Normal
I'm a bit sleepy right now, so I'm also drawing a bit of a blank
Expectation comes linearly
Ah its ok, i can try to figure it out on my own and if i run into a wall ill post it again tomorrow 😃 But i would assume the sum of two normally distributed variables would also be some sort of normally distributed variable
not 100% sure what the question is gettin at but yeah
Sum of normally distributed random variables
This means that the sum of two independent normally distributed random variables is normal, with its mean being the sum of the two means, and its variance being the sum of the two variances (i.e., the square of the standard deviation is the sum of the squares of the standard deviations).
ehhh sum of two normally distributed random variables is only normal if they're independent
ah
True..
Found this piece
If they are dependent you need more information to determine the distribution of the sum.
If X and Y are iid and X+Y and X−Y are independent then X and Y are normally distributed (and then so are X+Y and X−Y).
If X and Y form a bivariate normal distribution, then their sum is normal. This implies that the conditional distribution of Y given X is normal, the regression of Y on X is a straight line, and the variance of Y conditional on X does not depend on X. Similarly for the distribution of X given Y.
for whatever thats worth haha.
oh if X and Y form a bivariate normal distribution, then their sum is normal. They do form a bivariate normal.
How does this alpha parameter come into play ? Their coefficients
or whatever it means by specifying the parameters
the alpha just scales it
scaling a normal just gives another normal with mean $\mu\alpha$ and variance $\sigma^2\alpha^2$
well, I suppose they're bivariate normal
so the covariance matrix looks like: $$
\left[
\begin{array}{cc}
\Sigma_{ii}\alpha_i^2 & \Sigma_{ij}\alpha_i\alpha_j \\
\Sigma_{ji}\alpha_j\alpha_i & \Sigma_{jj}\alpha_j^2
\end{array}
\right]
$$
why is stats tainting my precious linear algebra channel 
because vectors?
the variance of the sum is actually just the sum of all of the entries in the covariance matrix lololol
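To make the "sum of all the entries" remark concrete, here is a small sketch that computes the mean and variance of $Z = \alpha_i X_i + \alpha_j X_j$. It assumes $X$ is jointly Gaussian, so that $Z$ is normal with these two parameters. The function name `linear_combo_params` and the numbers in `mu` and `Sigma` are hypothetical, and indices are 0-based in the code.

```python
def linear_combo_params(mu, Sigma, i, j, a_i, a_j):
    """Mean and variance of Z = a_i * X_i + a_j * X_j (0-based indices)."""
    mean = a_i * mu[i] + a_j * mu[j]
    # Sum of all four entries of the scaled 2x2 covariance matrix.
    var = (a_i ** 2 * Sigma[i][i]
           + a_i * a_j * Sigma[i][j]
           + a_j * a_i * Sigma[j][i]
           + a_j ** 2 * Sigma[j][j])
    return mean, var

# Hypothetical mean vector and covariance matrix, for illustration only.
mu = [1.0, 2.0, 3.0]
Sigma = [[4.0, 1.0, 0.0],
         [1.0, 9.0, 2.0],
         [0.0, 2.0, 1.0]]

mean, var = linear_combo_params(mu, Sigma, 0, 1, 2.0, -1.0)
print(mean, var)  # 0.0 21.0
```

Here the variance is $2^2 \cdot 4 + 2 \cdot 2 \cdot (-1) \cdot 1 + (-1)^2 \cdot 9 = 21$; the middle term is the covariance contribution that would vanish if the two coordinates were independent.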
dont worry i will have more linear algebra questions tomorrow 😃
help pls
I know it cant be C because there are two y values for x = 0, hence its not a function
but what about between A and B?
Hmm i think it's A because there are many points of y=0 in B, which makes it a higher degree polynomial. I'm sorry if I'm wrong I'm not good at this.
I also think so. I think it must make U turns many times to hit those points, so it has to be higher degree.
@broken girder weirdly, C contains two points having the same x coordinate: (0,0) and (0,-1)
so there's no polynomial that goes through all points of C
so there are two things:
- either we forget about C because it can't be interpolated
- or we consider the interpolating polynomial of C \ {(0,0)} or C \ {(0,-1)}. Both would have the same degree: 2
if we consider the second option, the answer would have to be C
but maybe cuz they say "choose the set that can be interpolated"
we'd have to consider A
oh sry I didn't see you already said "I know it cant be C because there are two y values for x = 0, hence its not a function"
hmm then yeah I agree with cat-lover: A
the idea is indeed what he said, but we'd have to "prove it" I guess
well A has 5 points so the interpolating polynomial would have degree <=4
B has 6 points so the interpolating polynomial would have degree <=5. let's call it P(X)
the thing is
as Cat-lover said
there are points having the same value
this gives information, by Rolle's theorem, that the derivative P'(X) equals 0 somewhere there
ok -2 and -1 have the same value, 0
thus P'(c_1)=0 for some c_1 between -2 and -1
same argument we get c_2 between -1 and 0, c_3 between 0 and 2, and finally c_4 between 3 and 4
notice that Rolle's theorem says that c_i is inside the open interval between the two points
this shows that c_1,...,c_4 are all distinct points
hence P'(X) has at least 4 distinct roots, so deg P' >= 4 and deg P >= 5 (P is not constant), which is more than the bound of 4 for A
so this really proves, without calculations, that A is the correct answer.
yup thats a really rigorous proof, thanks
It's a pleasure :)
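For anyone who wants to double check the Rolle argument by brute force, the degree of an interpolating polynomial can be read off from Newton divided differences. This is only a sketch: `interpolant_degree` is a made-up helper name, and the point set `hypothetical_B` is invented to match the pattern described above (equal values at -2, -1, 0, 2 and equal values at 3 and 4), since the actual sets from the exercise are not shown here.

```python
from fractions import Fraction

def interpolant_degree(points):
    """Degree of the unique interpolating polynomial through the given points.

    Uses Newton divided differences in exact rational arithmetic. The degree
    is the index of the last nonzero Newton coefficient (-1 for the zero
    polynomial). Raises ValueError if two points share an x coordinate.
    """
    xs = [Fraction(x) for x, _ in points]
    if len(set(xs)) != len(xs):
        raise ValueError("repeated x coordinate: no polynomial fits all points")
    coeffs = [Fraction(y) for _, y in points]
    n = len(points)
    for level in range(1, n):
        # Update from the bottom up so lower-order differences stay available.
        for k in range(n - 1, level - 1, -1):
            coeffs[k] = (coeffs[k] - coeffs[k - 1]) / (xs[k] - xs[k - level])
    nonzero = [k for k, c in enumerate(coeffs) if c != 0]
    return nonzero[-1] if nonzero else -1

# Hypothetical stand-in for set B (the real points are not in this log).
hypothetical_B = [(-2, 0), (-1, 0), (0, 0), (2, 0), (3, 1), (4, 1)]
print(interpolant_degree(hypothetical_B))  # 5

# Four points that happen to lie on y = x^2 only need degree 2.
print(interpolant_degree([(0, 0), (1, 1), (2, 4), (3, 9)]))  # 2
```

Passing a set like C, with two points at x = 0, raises the `ValueError`, which matches the observation that C is not the graph of a function.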
√(λ(M^2)) is always real for any
i)real matrix M
ii)real, symmetric matrix M
Could someone help prove why either or both are true?
Here, λ represents Eigenvalues.
I think part i is true, and part ii isn't.. for one, part i can have a rectangular matrix but for part ii, since it says symmetric matrix, it must be square.. Am I thinking straight?
if part i is true, so is part ii
M² only makes sense for square matrices in the first place
since the product of an n×m matrix with itself only makes sense when m=n
okay, so first of all to rephrase this, the question is whether the eigenvalues of M² are nonnegative.
a matrix has nonnegative eigenvalues if it’s positive semidefinite
if it is positive semidefinite, then vᵀAv ≥ 0 for all v
so look at vᵀMMv = (Mᵀv)ᵀ(Mv)
if M is symmetric, then this is the dot product between Mv and itself, so it is ≥ 0 (and 0 iff Mv=0)
so (ii) is true
as for (i) I believe a rotation matrix by 90 degrees should be a counterexample
its square will have eigenvalues -1
what does MvM mean?
that’s a fundamental property of the dot product
since it is an inner product
v·v ≥ 0, and =0 iff v=0
right, I can't think of a counter example to that so it is true.
you could show (ii) a bit differently too:
since M is symmetric, it is diagonalizable with real eigenvalues (spectral theorem). write M = QΛQ⁻¹. Then M² = QΛQ⁻¹QΛQ⁻¹ = QΛ²Q⁻¹, where Λ² now is a diagonal matrix with nonnegative values
secondly, I didn't understand what you did for part i there
so M² is diagonalizable and has only nonnegative eigenvalues
why should rotation matrix by 90 dgrees squared have eigenvalue -1?
yeah, got that part
may I just say proof left as an (easy) exercise to the reader?
but the basic idea was that a rotation matrix by 90 degrees is not diagonalizable (over ℝ) and so the argument about the diagonal matrix won’t work. further, that matrix is essentially analogous to i, and i² = -1
squaring it will straight up give you the negative identity
which is trivially a matrix with negative eigenvalues
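Both halves of the argument are easy to check by hand or in a few lines of code. The sketch below squares the 90 degree rotation matrix and, for one hypothetical symmetric matrix `S` and vector `v` (both made up for illustration), confirms that $v^T S^2 v$ equals $(Sv) \cdot (Sv)$.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def matvec(A, v):
    """Multiply a matrix by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))

# Counterexample for (i): rotation by 90 degrees squares to minus the identity.
R = [[0, -1], [1, 0]]
R_squared = matmul(R, R)
print(R_squared)  # [[-1, 0], [0, -1]]

# Illustration for (ii): a hypothetical symmetric matrix and test vector.
S = [[2, 1], [1, -3]]
v = [1, -2]
quadratic_form = dot(v, matvec(matmul(S, S), v))  # v^T S^2 v
norm_squared = dot(matvec(S, v), matvec(S, v))    # (S v) . (S v)
print(quadratic_form, norm_squared)  # 49 49
```

Note that `S` itself has a negative diagonal entry and is not positive semidefinite; it is only `S` squared that is.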
but umm
oh right
negative eigenvalues rooted would be complex
not real
so i is false
yea, basically instead of thinking about real/complex roots I found it easier to just think about positive/negative values ^^
for (ii) I’d say the diagonalization proof is nicer, but it requires heavier tools
the fact that the dot product is positive-definite is pretty fundamental, the spectral theorem (all symmetric matrices are diagonalizable) may not be
I see, I'm not comfy with spectral theo cuz I just did it a couple days back
oh actually it’s pretty easy to see that the dot product is positive definite
but I'll delve deeper and come back to this
yeah all elements squared and added
that’s v₁² + v₂² + …
each of those are nonnegative
oof svd
suppose, there wasn't SVD in there
1 can definitely not be true because M may not be diagonalizable at all
🆙 | Sascha Baer leveled up!
why is this bot a thing
if it was just √(λ(M^2)) = λ(M)
would the question change at all?
1 can definitely not be true because M may not be diagonalizable at all
Why?
rotation matrix isn’t diagonalizable
but has an svd
…isn’t diagonalizable in ℝ that is
it is in ℂ
but even in ℂ there are matrices you can’t diagonalize
while the svd always exists
oh right, orthogonal matrix can't have eigenvalue decomposition
is what you meant
right?
yea it can, identity is orthogonal
umm?
I simply mean that the matrix $\begin{bmatrix} 0&-1\\1&0 \end{bmatrix}$ has only complex eigenvalues
where are the complex eigen values?
so λ(M) = i, -i
I see 1 and 1 as its eigen values?
yea, diagonalization is this
why can't rotational matrix be diagonalised again?
think about it
what would it mean for a rotation matrix to have an eigenvector?
specifically in 2D
eigenvectors must remain parallel to how they started under the action
anything that is rotated is not an eigenvector
only stretching
except for those that rotate by 0 or 180 degrees
why is that?
but note: diagonalizable does not imply symmetric
I’ll let you think it through why those are symmetrical
as for the others, well, because they don’t have (a full set of) eigenvectors
so they can’t be symmetrical
because if they were they could be diagonalized
and also from a purely pragmatic perspective, rotation matrices always have a sin(θ) on one side of the diagonal and a -sin(θ) on the other
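The sin(θ) remark can be checked numerically for 2D rotations. The sketch below (helper names are made up) builds the rotation matrix, gets its eigenvalues from the characteristic polynomial $\lambda^2 - \text{tr}(A)\lambda + \det(A)$, and tests symmetry up to floating point tolerance. For a 2D rotation the eigenvalues are $\cos\theta \pm i\sin\theta$, so they are real exactly when $\sin\theta = 0$, which is also when the matrix is symmetric.

```python
import cmath
import math

def rotation(theta):
    """2D rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial lambda^2 - tr(A) lambda + det(A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)  # complex square root
    return (tr + root) / 2, (tr - root) / 2

def is_symmetric(A, tol=1e-12):
    """Symmetry check for a 2x2 matrix, up to a small tolerance."""
    return abs(A[0][1] - A[1][0]) <= tol

for degrees in (0, 90, 180):
    A = rotation(math.radians(degrees))
    print(degrees, eigenvalues_2x2(A), is_symmetric(A))
```

Only the 0 and 180 degree cases come out symmetric with real eigenvalues (1 and -1 respectively); the 90 degree case gives a conjugate pair close to i and -i.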
So any symmetrical matrix is a rotational matrix that rotates by 360 or 180 degrees, correct?
how did you come to that conclusion now?
wait
definitely not
the matrix that is all 2s is a symmetric matrix
and does not at all rotate ^^
they aren't cuz the columns aren't orthonormal
yeah well
does not at all rotate
is equivalent to
rotating by 0 or 360 degrees, no?
well, I mean, sure, but that matrix does other things
not all matrices are rotations ^^
I mean, would there be any counter example to the statement I made?
which one?
all symmetrics are rotational, by 0 or 180 deg?
any matrix that is symmetrical but not orthogonal?
the only rotation matrices that are symmetrical are the identity, and the identity but with two -1s somewhere on the diagonal
which correspond to 0 degrees rotation, and 180 degree rotation in some plane
as you can obviously see, most symmetric matrices are not those
sure is not, yea ^^
I don’t think you’re being pedantic in the right way
so that statement, as dumb as it sounds, is still always right, right?
in my opinion, that matrix does not rotate at all
as in
not “it rotates by 0”
“it does something which cannot be described as a rotation”
uhmm
I mean it moves all base vectors to the vector (2,2,…2)
so every vector did a completely different motion
how can that possibly be called rotation?
rotation is a rigid motion in a 2-dimensional subspace
but no 2-Dimensional subspace is even kept alive
I’ve seen two definitions of rotation


