#linear-algebra
2 messages ยท Page 288 of 1
Okay.
And D_ii = 0 if L_ii != 0
i made a counter example in octave rn
you can always construct B by taking QD
doesn't have to be QDQ^T
But QD is not symmetric?
Won't it break one of the requirements then?
Do we have freedom to choose Q, or is it given?
You can choose Q
you can't actually, but you don't need it
scaling not because Q is ortho
If Q is given, then I think we also need to allow D_ij = D_ji !=0 when L_ii and L_jj are both zero.
Or you meant sth else probably
D is diagonal
in my understanding the original problem is rather that
[A B] and [A; B] are full rank, A is symmetric
and the image of B is ortho to that of A
Yes
if you can avoid using the EVD of A, there is no need to use it
Oh, I guessed it was part of the question whether D would be forced to be diagonal.
I still don't understand the definition of B.
My guess was that B = QDQ^T, but that's just a guess
any matrix that makes the block matrices [A B] and [A; B] both have maximum rank
B is what you want
some square matrix the same size as A, apparently
B can at least be any invertible matrix, symmetric or not.
then criver added the constraint that the image of B is the ortho complement to the image of A
Ah!
B cannot be invertible if A is not zero
now if you have that you ALSO what the image of A^T to be ortho to that of B^T
then it will be symmetric (off the top of my head, at least)
but as it is, no
But A is symmetric
that's what i mean, criver
So don't the images match
what?
yes, that's exactly why i wrote what i wrote?
I am referring to
yes
Hmm, I'm not convinced. What if $A=\begin{pmatrix}I\&0\end{pmatrix}$ and $B=\begin{pmatrix}0\&E\end{pmatrix}$ as block matrices, where E is arbitrary invertible?
I am still trying to digest whether QD will work
Troposphere
yeah, this convinced me
not only that, but for this one, even if the images of the originals and transposes were required to be orthogonal to each other, it still wouldn't make B symmetric
so in general, no
more importantly
don't pick up the habit of just making up properties and not checking them
what do you mean?
The B = QDQ^T was a guess that works, but the converse is not true (i.e. that any B is of this form), is that the property you're referring to?
yep
yeah, it is not a good habit, I am often stuck trying to prove stuff that's not true thanks to it
i think it would be full row rank, obvious since column rank = row rank and rank(A,B) = min(rank(A), rank(B)) right?
is this true
What happens when $P^n = DQ^nD{-1}$ when $n \to \infty$?
ExtraterrestrialPigeon
$P^n = DQ^nD^{-1}$ when $n \to \infty$
elemen
well if the eigenvalues of $P$ have magnitude less than 1, then $P^n$ tends to 0
elemen
otherwise it's unbounded
I did a bit of clarification on A field has no divisors of zero. I already know it's referring to for any x, y in a field K, if xy = 0 then either x = 0 or y = 0. I have proven this. It's just the wording "has no divisors of zero".
I am thinking of it in the sense of elementary number theory and it's confusing me.
The usual term in English is a "zero divisor".
ah I see, thanks.
but $P^0 = SD^0S^{-1} = SS^{-1} = I$?
ExtraterrestrialPigeon
Are we saying that n is bounded $0 \le n \le 1$?
ExtraterrestrialPigeon
no, n tends to infinity. if the eigenvalues of P have magntiude less than 1 then P^n tends to 0
So something like $||x|| = \sqrt{Q}$ where Q defines some rational number?
ExtraterrestrialPigeon
Are there no bounds to this then, even if it's on an open interval as long as the magnitudes have a value less than 1 then it tends to 0?
Solving x' and y' for X and Y the solution for this problem is x'= Ysin + X cos ; y' = Ycos - Xsin
?
my book don't have the answer for this problem
this is a rotation matrix:
$\begin{bmatrix} x \ y \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x' \ y' \end{bmatrix}$
criver
the inverse is the transposed
$\begin{bmatrix} x' \ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \ y \end{bmatrix}$
criver
it's also the same as setting theta = -theta
i.e. a rotation by angle theta is undone by a rotation by angle -theta
i think so...
if M is 3x2 matrix, how do I show that these are equivalent?
I know if that M1 and M2 are linearly independent, then if $a_1 M_1 + a_2 M_2 = 0$, then $a_1=a_2=0$
bubbly cat
but how do I show the 1st one with the others?
Suppose there exists a vector (s,t) != (0,0)
Such that s * M1 + t * M2 = 0
without loss of generality assume s!= 0 -> M1 = t* M2 / s, they are linearly dependt (you don't evenneed to do this division part but I think it's clearer)
Since there is no such vector -> kernel = {0}
sure I get how b and c are connected
but idk how rank 2 ties into this
i mean I do get it kinda but idk how to show it
Well how is rank defined in your book
I assume it is defined as the dimension of the col/row space
any basic definition works
But if the two vectors are independent the the dimension of the space spanned by their lin comb is 2
The a,b,c parts are actually quite tautological
yea 
In the sense that a) b) c) are the same once you write the definition
I don't even know what there is to prove, unless you have a funny definition of rank in your book
You write the def of kernel, the def of lin indep, the def of rank, and you see the requirements are the same
yea i noticed theres a lot of equivalent definitions for some reason
Lin indep means there are no (s,t) != (0,0) such that s * M + t * M2 = 0, but the kernel is made only of vectors for which s * M1 + t * M2 = 0
Thus kernel= {(0,0)}
the column rank of M is the dimension of the image
but M1, M2 are independent -> 2
oh ok thanks
I think it's trying to get you familiar with the definitions
You're supposed to write ot the definitions of all 3 ad think what those mean
yea I think this was supposed to be mostly review (but I kinda forgot it all lol)
Lin indep of a set of vectors v1,...,vn <-> sum_i s_i * v_i = 0 only for (s1,...,sn) = (0,...,0)
kernel(M) = {v : Mv = 0}
column-rank of M -> the dimension of the column space of M -> column space M = span(M1,...,Mn) = {\sum_i s_i * M_i}
then you get to the definition of dimension and basis
It's testing basic definitions
sure I still remember that I think yea
Just go through the first chapter of axler or any similar lin alg book
this is not linear algebra
given a vector space $V$ with subspace $U$ and a linear map $T : U \to W$ can we extend $T$ to be defined on all of $V$ without assuming $V$ or $W$ is finite?
uli
this is equivalent to showing there exists another subspace $\tilde U$ of $V$ such that $V = \tilde U \oplus U$ since then we could define $T$ on $\tilde U$ no problem
uli
^this holds if aoc is assumed
oh ok, is it false without aoc?
yes
neat
the matter of extension in infinite dim spaces came up recently
coys comment on aoc below
"every vector space has a basis" is equivalent to aoc, so if you dont accept aoc, then there must be a vector space which has no basis
ooh cool
What does a multilinear function mean?
bilinear is specific case of multilinear
ohh ok I get you now
2 slots, linear in each
if you want to specify, you could say n-linear function
got it thx
So what's the motivation for tensors? I see that theres a bunch of definitions but idk how they are motivated
like I see duals, exterior tensors, tensor products, but idk the point of them
I've been thinking that isomorphisms are more intuitive then dimension arguments and I'm wondering why axler doesn't use them when talking about duality, here's a quick draft of how I think about duality with isomorphisms https://uli.rocks/p/duality-done-right which I think its more elegant then axler's dimension juggling proofs
Duality done right - A more elegant view of duality then is presented in axler's book
I've probably missed something or maybe axler does it for pedagogy reasons in case people skipped the previous section, idk
one motivation for tensors is to define differential forms and to investigate what the right type of objects to integrate over a manifold are.
ah I guess I am not at that level to see it motivated yet (soon though maybe)
why do you say "duel" space and not dual space on this page
uhh oops that's embarrassingly bad spelling
i think its kinda funny
hey guys, im confused, for b im doing it, but it says the answer is (x+y,0,y+z) not sure how they got that
ignore, i didnt see that pattern
so you all good ?
yea, i was tripping
They are used alot in differential geometry
To talk about curvature
There are also alot of examples in physics
it's just an easy way to describe multilinear relationships
in the most general definition it reduces the study of multilinear maps to the study of linear maps
which is really nice, because you only have to worry about the latter and not 2-linear, 3-linear, etc separately
I have seen the notation M_{ij} for denoting minors of a matrix (row i removed and column j removed). Is there some standard notation for removing multiple row/columns?
For example if I have diagonal matrices C1 and C2 with 1s and 0s on the diagonal, the C1 M C2 would produce a matrix with a maximum of |C1| non-zero rows and |C2| nonzero columns - and I want to extract those in a reduced matrix with the 0 rows and cols removed.
Would $M|_{C_1 \times C_2}$ makes sense as notation for this?
criver
Or is there something standard
you can make up whatever notation you like as long as you explain it clearly
i can't think of anything standard off the top of my head
what you could do is make C1 and C2 rectangular instead of square
e.g. for C1, remove all rows that are 0 vectors
similarly for the columns of C2
then you would take C1 M C2
and that product gives you what you want
I am aware that I can write it as a restriction and prolongation operator, the problem is that C1 and C2 change coefficients in my problem, so I generally have to keep them square. Though if there's no established notation I guess I will just define this as suggested.
you could still do that in code
no reason to use a matrix for it
you can represent the linear transformation in a matrix free fashion through indexing
much in the same way you want to do it on paper
then what you'd want is a function that takes in M and two lists of indices
then it slices M up based on those lists
Thanks. I am not yet at the code part though
now that we bring this up, you do see a lot of notation of the form $\boldsymbol{M}_\mathcal{I}$, for example
Edd
where \mathcal{I} is an index set
Of tuples?
hmm?
(i,j) in I
i was gonna suggest the notation $\boldsymbol{M}_\mathcal{I}^\mathcal{J}$
Edd
where you keep the rows of M indexed by I, and the columns indexed by J
I think they'd usually put I top, J bottom
however you like ๐ you could also just stick to the one you proposed
as long as the notation is explained clearly somewhere, you can really use whatever you prefer
but i think what you said matches the covariant/contravariant convention
I meant in the context of physics they have row indices top and col indices bottom for (1,1) tensors. But it seems reasonable notation too, thank you.
right
tensors 
can't you represent multilinear maps by a linear map on the product type? or am I missing something
are you talking about tensor products?
a bilinear map A x B -> C corresponds to a unique linear map A tensor B -> C
this is what i meant
Patrick Bateman
check whether they are linearly dependent
put them in a 3x4 matrix
and apply elementary operations to bring them to reduced echelon form
if all rows remain at the very end
then it's all 3 vectors
if one row is zero
then take the other two rows
if two rows are zero
then take the only remaining one
you are basically given a set of vectors
and you want to find a minimal set of vectors that span the same subspace as those
a basis is made up of linearly independent vectors
if you have linear dependence then you have a frame/overcomplete basis instead
the usual for these exercises is that they give you a frame/overcomplete basis and you need to reduce it to a basis
I would stack them horizontally
and then perform row ops
you can of course work with the above too, but you'll have to perform col ops
it does matter when you start juggling variables around if you have a linear system of equations, it's a pain changing the order of your variables
just use the transpose and apply row ops
then they are linearly independent
they simply didn't remove the 0s like you did it seems
but since the vectors are linearly independent
then even the initial are fine
(1,1,-4,-3), (2,0,2,-2), (2,-1,3,2)
the issue is when they are not linearly independent
then you take out only the non-zero rows
if they are dependent then the matrix would not have full row-rank
i.e. you'll get a zero row
at some point through elementary transformations
you can get all rows except 1 to be zero
that would be the case if all vectors lie on a single line
e.g. when they are multiples of each other
then taking a single one of the initial vectors is fine
the number of non-zero rows would be the dimensionality of the spanned subspace
e.g. 1 -> single line passing through the origin
2 -> 2d plane passing through the origin
3 -> 3d hyperplane passing through the origin and so on
you need only k independent vectors to span a k-dimensional subspace
anything more would be linearly dependent
i.e. you'll get a frame/overcomplete basis
How do I create a matrix that transforms a normal vector to <0, 0, 1> and rotates all other vectors accordingly?
Do I just write out the product of two rotation matrices and the normal vector and then solve for the angles? Is there a better way?
What is a normal vector
There is a better way
Like you can just take it to be any arbitrary vector I think
Assuming it's not the 0 vector
And it has length 1
Let n be unit length and let it be orthogonal to f1,...,f_n and you are looking for R such that n' = R n, where n' is also unit length. Compute gi = fi - dot(fi, n') / (1 + dot(n,n')) * (n+n').
So the matrix is
$R= I - \frac{1}{1+n^Tn'}(n+n')n^T$
criver
If I didn't mess up something
Also n' cannot be allowed to be -n
I think that would be a problem
Since then there are infinitely many rotations that fit the bill
A normal vector could certainly be 0 0 -1 and I would want to transform that to 0 0 1
There are infinitely many rotations that do this
You can pick any one of those in that singular case
What I wrote picks the minimal rotation in terms of angle
Well, I would like to know how what you wrote works
But there are infinitely many 180 degree rotations
it considers the 2d plane formed by n and n'
And then does some projections
I don't understand
A rotation changes only the components of a vector parallel to the rotation plane
So you decompose fi = fi_parallel to_plane + fi_orthogonal_to_plane
From this
The matrix is some kind of rejection operation afaik
I think it would help to just set n' to 0 0 1 since that's what I'm concerned with
I can try to write out the whole derivation if you give me some time
Yeah, if you are getting this from somewhere I would also like a link. I don't understand your explanation
I am getting it from my memory ๐
This makes sense
Dot n with z to get angle. Cross n with z to get axis. Convert to axis-angle rotation matrix
the above would work, but it will break also for n = n'
Yeah, so if n = n' just do nothing
also breaks for n=-n' as mine does
anyways
here goes the derivation
you have your two unit vectors n, n' and provided they are not linearly dependent they span a 2D plane
now say you have a vector orthogonal to n, e.g. vector v
decompose it into a piece parallel to the plane and a piece perpendicular to it
$v = v_{\perp} + v_{\parallel}$
criver
I also want to decompose v_parallel in an orthonormal coordinate system
to get that orthonormal coordinate system I would find n_perp that lies in the plane spanned by n,n' and is perpendicular to n
$n_{\perp} = (n'-(n^Tn') n) / \sqrt{1-(n^Tn')^2}$
criver
criver
but n^Tv = 0 since I started with the assumption that they are orthogonal
thus
$v_{\parallel} = (n^T_{\perp}v)n_{\perp}$
criver
The rotated vector v' can be decomposed in a similar way
$v' = v'{\parallel} + v'{\perp}$
criver
criver
and since the rotation doesn't affect the orthogonal parts
then
$v'{\perp} = v{\perp}$
criver
so at the end you get:
$v' = v'{\parallel} + v'{\perp} = (n'{\perp})^Tv')n{\perp}' + v_{\perp} = (n_{\perp})^Tv)n_{\perp}' + v - v_{\parallel} = v + ((n_{\perp})^Tv)(n'{\perp}-n{\perp})$
criver
let me see if I wrote everything correctly
I think I did, now if you plug in the definitions of n_perp and n'_perp you should be able to simplify some terms and end up with what I had
there's probably an easier way to derive the same thing, I am not sure
the key points used are that $v'{\perp} = v{\perp}$ and $n'{\perp}\cdot v' = n{\perp}\cdot v$ which follow from the requirement of this being a rotation
criver
i.e. rotation leaves the space orthogonal to the rotation plane untouched, and angles and lengths are preserved/the dot product being preserved, since x'^Ty = x^TR^TRy = x^Ty
if you don't simplify the final expression
then you're left with
$v' = v + (n_{\perp}\cdot v)(n'{\perp}-n{\perp})$
and if you expand the n_perp and n'_perp in terms of n and n'
you get a term sqrt(1-(n^Tn')^2) in the denominator
which would result in a 0 for n=n' or n=-n', doing some simplifications gets rid of the n=n' singularity
but the n=-n' singularity remains
but this is clear, since there are infinitely many "shortest" rotations that get from n to n' if those are at a 180 angle
criver
the simplified version (after plugging the definitions for n_perp and n_perp') is
$v' = v - \frac{n'\cdot v}{1+n'\cdot n}(n'+n)$
criver
and if you were to extract the matrix from this you get
$v' = \left(I - \frac{1}{1+n'\cdot n}(n'+n)n^T\right)v$
criver
I think something can be done to modify it further to make it work for v not perpendicular to n
actually I think that's trivial
one just needs to remove the v_perp part, apply the above matrix, then add the v_perp part back, actually maybe it's not that simple, since this would refer to the perp part to n and not to the plane
@jagged crown i figured out that reduction problem after thinking abou it a lil more https://www.emathhelp.net/en/calculators/linear-algebra/reduced-row-echelon-form-rref-caclulator/?i=[[1%2C1%2C-1]%2C[2%2C8%2C-6]]&reduced=on
The calculator will find the row echelon form (simple or reduced โ RREF) of the given (augmented if needed) matrix, with steps shown.
we just needed to multiply 2 by its output of x before moving

if u care
b is in the span of the columns of A
ty! ill check it out after my test 
For a symmetric positive definite matrix $A$ one has the optimal step size for gradient descent: $\gamma = \frac{\boldsymbol{r}^T\boldsymbol{r}}{\boldsymbol{r}^T\boldsymbol{A}\boldsymbol{r}}$ where $\boldsymbol{r} = \boldsymbol{b}-\boldsymbol{A}\boldsymbol{x}$.
criver
For solving Ax = b
is there a similar result for nonsymmetric matrices (also for symmetric but indefinite)
Need help in problem 3
My question is
i don't understand where i should place this 1 unit of production from sector 1
Let (u, v) be the Euclidean inner product $R^2$, and let $u=(2, -1), v=(-1, 3), w=(0,-5)$. Verify that $(u+v,w) = (u,w)+(v,w)$. I got $$(u+v,w)=(2 \cdot 0+(-1) \cdot (-5))+((-1) \cdot 0+3 \cdot (-5))=-30$$
John doe
but according to my textbook the answer is equal to 0. I have no idea what I did wrong here. For me it doesen't look like I did anything wrong
u+v = (1,2)
<u+v,w> = -10
<u,w> = 5
<v,w> = -15
dont you just do $(u+v,w)=(u_{1}w_{1}+u_{2}w_{2})+(v_{1}w_{1}+v_{2}w_{2})$
John doe
yea
u can do that
your arithmetic is wrong
what did I do wrong?
$2 \cdot 0+(-1) \cdot (-5)+(-1) \cdot 0+3 \cdot (-5)=-30$
elemen
you get 0 if you do <u, v+w>
can you show me ur work
how did you get it to 0
I computed
first you add the two vectors v and w
v+w = (-1,-2)
then you compute dot product between u=(2,-1) and (-1,-2)
I see, but my textbook did: $(u+v,w)=(u{1}w{1}+u{2}w{2})+(v{1}w{1}+v{2}w{2})$
John doe
hey, i am having trouble understanding this proof, can anyone help?
where is the proof ?
DarQ (Shuri for honorable)
DarQ (Shuri for honorable)
DarQ (Shuri for honorable)
plug into definition
of linear independence
$\lambda_0\cdot 1 + \lambda_1 \cdot x + \dots + \lambda_k \cdot x^k = 0$, show that all lambdas are 0
sum_i c_i x^i = 0 <-> c_i = 0
Lochverstรคrker
oooh
this follows immediately for degree reasons
you can also show its isomorphic to R^{k+1}
I have a super tough question
I don't know if the teacher wrote it wrong or what
It was 3biii
So what's the issue?
Can someone assist because NOBODY I know got it right
I've wrote down the Matrix
So how do i reflect it
B
It's a line passing through the origin
Project a vertex on it
Let the vertex be v
And the projection be v'
U want me to take the partial derivatives basically
no
So i have w=8i+j
$v = v_{\parallel} + v_{\perp}, \quad v_{refl} = v_{\parallel} - v_{\perp}$
criver
The parallel and perp refers to parallel and perp to the line
The line passes through the origin and is parallel to (8,1)
Idk what that array of numbers is
Do you understand how to compute v_perp and v_parallel given v?
I'm not sure but I don't think that's right. I probably could
Can u open this?
No
Do you know how to compute the projection of v onto the line?
v is an arbitrary point in space
Omg math teachers need to explain shit better
A vector
Do you know how to project it onto the line passing through (0,0) and parallel to (8,1)?
Y-b=m(x-a)
Have you studied dot product
This isn't even a dot product question
Wtf
Scaler product we call it
This is
A reflection
Sure you can use the reflection matrix that is presented there probably
It's probably more useful to understand how to derive it yourself though
To project a vector onto another vector, you can use dot(v,u) * u, where u is unit length
Then dot(v,u) * u is the projection of v onto this unit length vector
And you can split v into
$v = v_{\parallel} + v_{\perp} = (v\cdot u) u + v_{\perp} \implies v_{\perp} = v - (v\cdot u)u$
criver
$v' = v_{\parallel} - v_{\perp}$
criver
Try drawing a picture
U can split v into to because its projecting 2 ways. If u look at it like that
One projection on another and vice versa?
I am splitting v into a component parallel to the line, and a component orthogonal to it
Try drawing it
Ok
Having 2 vectors, one unit length
Wait
And then projecting a vector onto it
You should also try drawing: v = v_parallel + v_perp, u, and v' = v_parallel - v_perp
Ok
But that's for part c in my question
For iii I just need the coordinates
And then I'll be able to understand it better then
How do I get the reflected ones?
It's not for your question, it's so that you will understand what I was explaining visually
Then you should have no trouble continuing it on from this point
I am having trouble
Because I don't understand what u told me relates to my question
what does this do?
What is v'
Gets you the original line
This
The original projection
Draw it
is (undefined-0.5) semi-stable as a s classigication?
And you'll figure it out
how do you classify something that is undefined? in directiona fields
Just draw v, u, v', v_perp, v_parallel
What does a tangent bundle of a surface s look like?
Is it just all lines perpendicular to it
Criver
I loved your expiation
But did not understand what any of that had to do with my hexagon matrix question.
Could u walk me through my question step by step and then explain why u taught me what u did?
I want to understand
Why wanted me to use this.
The teacher never taught that.
Did you draw it
Yes
So what is v', you can post you drawing btw, I can check whether it's correct
draw u, denote v_perp and v_parallel, and then draw v'
That's wrong, also they should be all on a single drawing
You should use
v_parallel = v projected on u = (v dot u) * u
Why does the drawing relate to my hexagon. I don't get it.
v_perp = v - v_parallel
The teacher NEVER taught this
I get it
Then you'll be able to reflect any of the vertices using the same approach
You can reflect all of the hexagon's vertices using this approach
Ok
It's irrelevant that the vertices are ordered in a matrix
At the end of the day they are represented through 2d vectors
What's written here
look, draw u and draw v with their tails at the origin
Then draw the projection of v onto u
The projected part is v_parallel, the perpendicular part is v_perp = v - v_parallel
where you have v_parallel = projection of v on u = (v dot u) * u
v_perp = v - v_parallel, and v' = v_parallel - v_perp = (v dot u) * u - v + (v dot u) * u = 2 * (v dot u) * u - v
thus the matrix is:
$v' = \left(\frac{2}{|u|^2}uu^T - I\right)v$
criver
$M = \frac{2}{|u|^2}\begin{bmatrix}u_1^2 & u_1u_2 \ u_1u_2 & u_2^2 \end{bmatrix} - \begin{bmatrix} 1 & 0 \ 0 & 1 \end{bmatrix}$
criver
which is exactly the matrix that you have here
but I have added a normalization factor too, in case u is not unit length (i.e. |u|!=1)
in your case u = (8,1) is not unit length
multiplying every vertex by the matrix, would yield you their reflections
I highly recommend going over some basics on vectors and analytic geometry since you seem to be struggling with that
I can suggest this as something that should be fairly simple and with enough pictures: http://immersivemath.com/ila/index.html
hey, can someone help me understand c? I don't know how to prove this
$q(x) = \sum_{i=0}^n b_ix^i = \sum_{i=0}^n a_ix^i + \sum_{i=1}^n i a_i x^i = p(x) + xp'(x)$
criver
so, b_ix^i is q(x)? just to make sure
Yes
I picked b for the coef of q and a for the coef of p
All you have to do is expand the polynomials and the derivative
okay, give me a sec
i have q(x)=p(x)-xp'(x) written out, so, where do i go from here?
they have written q(x) = p(x) +xp'(x)
Now set q(x) = sum_i b_i x^i and p(x) = sum_i a_i x^i and substitute in the equality
i wrote them out with the sums already done, got b_1+...+b_n x^n= a_0+2a_1 x+...+(n+1) a_n x^n
how do i finish the proof from here?
It's wrong though
What you wrote is wrong
am i setting p(x) to q(x)?
can you explain why I would do that?
what about this, can someone help guide me through this one?
For any vector $x \in U, x = a_1 f_1 + \dots a_m f_m$
elemen
$U^\perp$ is all vectors $y$ such that $x^\top y = a_1f_1^\top y + \dots + a_mf_m^\top y = 0$
elemen
so if we define the matrix $ A = [f_1, f_2,\dots,f_m]$ then y must be in the nullspace of $A^\top$
elemen
and now you can show that $f_{m+1},\dots,f_{n}$ is a basis for the nullspace of $A^\top$
elemen
^ I'm not sure if all my steps are right but hope it helps
what is the symbol above x in this
.
transpose
fml, i haven't seen that at all this unit
transpose just converts column vector to row vector and vice versa.
yep, but not sure how to use it bevcause i haven't been asked to use them once
okay you can just replace $x^\top y$ with $<x,y>$
elemen
<x,y> is inner product
the transpose exchanges rows and columns in a matrix, it works the same for vectors since column vectors are (n by 1) matrices and row vectors are (1 by n) matrices
and vise versa. this is especially useful because $x^Ty$ denotes the dot product
$$
x^Ty = \begin{bmatrix}
1 \ 2 \ 3
\end{bmatrix}^T
\begin{bmatrix}
3 \ 4 \ 5
\end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 \end{bmatrix} \begin{bmatrix} 3 \ 4 \ 5 \end{bmatrix}
= 1\cdot 3 + 2\cdot 4 + 3\cdot 5
$$
uli
via the same matrix multiply rule you're hopefully familiar with
http://matrixmultiplication.xyz/ is a nice way to think about it btw
An interactive matrix multiplication calculator for educational purposes
not really confused abot that part, just what the last person wrote out, like this proof makes literally no sense to me. I just can't crack it
I've tried it for a day and a half now
well, by the definition of span we need only show every $u \in U^{\perp}$ can be written $u = a_1\mathbf f_1 + \dots a_n\mathbf f_n$ for some $a_n$
uli
since every $u \in U^{\perp}$ is also in $\mathbb R^n$ we can write $u$ as a combination of the basis for $\mathbf R^n$
$$
u = a_1 f_1 + \dots + a_nf_n
$$
uli
(oops, in the first message I sent I should have used m for the final subscript, ignore that)
anyway, since $u \in U^{\perp}$ is perpendicular to every $f_1,\dots,f_m$ in $U$ we must have $\langle u, f_j\rangle = 0$ for $1 \le j \le m$
uli
(oops in the first message I should have said we want to show $u = a_{m+1}f_{m+1} + \dots + a_nf_n$ ignore that)
uli
using the linearity of the inner product and writing u as a combination of f1,...,fn as above gives
$$
\langle u, f_j\rangle = a_1\langle f_1, f_j\rangle + \dots + a_n\langle f_n, f_j\rangle
$$
since the basis is orthogonal, $\langle f_k, f_j\rangle = 0$ for $j \ne k$ leaving us with
$$
\langle u, f_j\rangle = 0 = a_j\langle f_j, f_j\rangle
$$
uli
(this is why orthogonal bases are so nice)
inner product is dot product correct?
yes
since $\langle f_j,f_j\rangle > 0$ this shows $a_j = 0$ for all $1 \le j \le m$
uli
which turns this into
$$
u = a_{m+1} f_{m+1} + \dots + a_{n}f_{n}
$$
as desired.
uli
I wrote out all the steps in a ton of detail but really it consists of like 2 steps
-
write $u = a_1f_1 + \dots + a_nf_n$ for $u \in U^{\perp}$
-
dot both sides with $f_j$ for $1 \le j \le m$ to get $\langle u, f_j\rangle = 0 = a_j\langle f_j, f_j\rangle$ implying $a_j = 0$
-
hence $u = a_{m+1}f_{m+1} + \dots + a_nf_n$ meaning $u \in \text{span}(f_{m+1}, \dots, f_n)$ so $U^{\perp} = \text{span}(f_{m+1},\dots,f_n)$ as desired.
uli
the dot product between f_k and f_i, which is zero since f_1, ..., f_n is an orthogonal basis
which is set is f_k? or is that arbitrary?
f_k denotes the kth element in f_1,...,f_n. in this post it's saying the only term that isn't zero is the f_j, f_j one
because the others die from orthogonality
$$
\langle u, f_j\rangle = a_1\langle f_1, f_j\rangle + \dots + a_n\langle f_n, f_j\rangle = 0 + \dots + a_j\langle f_j, f_j\rangle + 0 + \dots + 0
$$
uli
basically any argument involving orthogonal vectors will involve taking a dot product at some point and cancelling a bunch
how do know <f_i,f_i> =0?
? it isn't, <f_i, f_i> is the length ||f||^2
you mean <f_i, f_j> = 0
and that's true by the definition of orthogonal
you might be used to seeing it as $f_i^Tf_j = 0$
it's the same thing
uli
it's greater then zero
because the length of f_j is nonzero
you need that to show aj = 0 rigorously
i thought it could =0 as well
only if f_j is zero, which it can't be because it's part of the basis f_1, ..., f_n
ie. if it were zero the list f_1, ..., f_n would be linearly dependent hence not a basis
oh okay, that makes sense, because a 0 vector cannot be a part of a basis
thank you!
np
Not quite math but I know alot of people use latex here. Does anyone know how to write matrix in respect to a,b basis in latex
$[T]_a^b$ for transformation T
Mosh
that's standard notation for $T:V\to W$ with basis of V a, basis of W b
Mosh
(iirc at least)
ty
first show that every norm is continuous.
show that the max norm unit ball {x in R^n : |x|_{max} = 1} is compact in R^n.
then show that every norm is equivalent to the max norm. for one direction, make use of the fact that R^n is finite dimensional, i.e., it has a finite basis. for the other direction, use the extreme value theorem
You can just show directly that every open ball according to N1 contains some open ball according to N2, for arbitrary norms N1 and N2. No need to give special status to the max-norm.
The key step is to decide on the radius of the N2-ball, which you can do by comparing the N1 and N2 lengths of each basis vector, and using the triangle equality to find an upper bound for the N1-length of a vector that fits in the N2 unit ball.
Hmm, no, actually I'm talking nonsense. The direct way I was imagining won't work for an arbitrary basis. Ignore, sorry.
Wait can someone try to find this constraint
How did you find the one you already have?
I would do Gaussian elimination on your matrix. That ends with two rows that have zeroes in the first three columns and a combination of xyzt in the last. Those two combinations clearly need to vanish for the last column to be a linear combination of the three first.
If you only get one row of constraints, you must have made a mistake along the way.
yeahh nah
it was fine
i was doing this for a friend and we were just stumped with the nasty row echelon form
but if u keep the rows in the same place and not do it through an adhoc method
it comes out nicely somehow lol
hey how would I go about putting a system of equations in matrix form if I'm given this
x'(t) = m1(y(t) โ x(t)) = โm1x(t) + m1y(t)
y'(t) = m2(x(t) โ y(t)) = m2x(t) โ m2y(t)
I'm given that m1 = 0.1 and m2 = 0.2
like I know how to do normal matrices alright, but I'm a bit confused what to do since I have 2 equal signs
am I just supposed to break it in half and make both parts equal to the x'(t) and y'(t)?
Can someone help me? I need to convert the Airy differential equation to Bessel equation
Hi I posted a question
I saw the following in a paper
A matrix A is given, and they construct a matrix Z that spans its null-space
Then they claim that the projection of a vector on the null space of A is given by:
Z(Z^TZ)^{-1}Z^T
this I could reverse engineer from
$\min_x|Zx - y|^2 \implies x = (Z^TZ)^{-1}Z^Ty$
criver
so they want the closest vector Zx to y, and that is why the projection is Z(Z^TZ)^{-1}Z^T - I get this part
Then they claim that I-A^T(AA^T)^{-1}A = Z(Z^TZ)^{-1}Z
and I am not sure how they come to this conclusion
as far as I can tell A^T(AA^T)^{-1}A supposedly projects on A^T, that is, it tries to find the best approximation A^Tz to some vector w
then with the identity and the subtraction, we take away that part
As I understand it given a vector w
$w_{\parallel A^T} = A\min_{z}|A^Tz-w|^2,, w_{\perp A^T} = w - w_{\parallel A^T}$
criver
and the claim is that w_{perp A^T} = w_{parallel Z}
Z is the nullspace of A though, so what is A^T doing there
pinv(A) gives the projection onto A and pinv(Z) gives projection onto orthogonal subspace of A
as edd said, they are orthogonal
if you do the SVD of that mess, you get an orthogonal projection VV^T
and then I - VV^T projects onto the ortho comp of the span of V
yes but why is it A^T paired with Z instead of A paired with Z
hey guys, i have a question, for this, how do i find the basis for ker T? do i do each transformation individually?
for that you need to find the kerT first
take any arbitrary polynomial a(1-x)+b(1+x)+c(1-xยฒ) and check when T(p)=0
probably wanna convince yourself that the given polynomials are a basis for all polys of order <= 2 first
I thought of writing a+bx+cxยฒ but then I thought nah, OP can do it 
ok, yeah the nullspace of A is orthogonal to the row space A^T
might wanna look up fundamental theorem of linalg
the SVD does quick work of these properties
yeah I have even had to prove that
(that's kinda circular)
brain for some reason doesn't turn on though unless I have solved enough problems with it
so wait, i take an arbitrary polynonmial first?
What happens though if AA^T is singular? I am assuming I would have to compute its eigendecomposition and take the pseudo-inverse in the sense of \lambda_i' = 1/\lambda_i for lambda_i !=0, and \lambda_i' = 0 for lambda_i = 0.
this projection seems to rely on the existence on the inverse in: A^T(AA^T)^{-1}A
you actually WANT that to be singular
otherwise the projection is an identity matrix and its complement is a 0 matrix, and that's trivial
indeed you use the pseudo inverse
I am talking about: AA^T
yes
if it's singular then I - A^T(AA^T)^{-1}A is ill-defined
you use the pseudo inverse
there's a fundamental theorem of linalg?
strang
it's the name made popular by strang
which one?
the relationship among col space, left null space, null space, and row space
nice
that doesn't sound right, I am fairly sure that what I have is non-singular for most cases, unless I set contradictory constraints
A is mxn with full row-rank
then the null space is just the 0 vector and you don't have to worry
the issue is I want to feed it contradictory constraints and satisfy those in an L2 sense
what is "contradictory constraints"
e.g. introducing linearly dependent rows
Ax = b
where A has been supplemented with linearly dependent rows
the usual A is # rows < # cols and linearly independent rows
then AA^T is non-singular
however if I supplement linearly dependent rows, then AA^T is singular
where does the word "contradictory" come from though
A is a constraint matrix in my case, but that's probably not very relevant, contradictory constraints are linearly dependent
u_i = b1 and u_i = b2 with b1!=b2
that makes the rows linearly independent
the standard setting where you have this is linear least squares
(if you're looking at the augmented mat)
that's fine
yes
wdym yes, i'm saying the opposite as you
My understanding is that then the pseudo-inverse would require me to do an eigendecomposition
but I would like to avoid that
basically is there any way to rewrite this to be non-singular
no
the whole point of this is that in some cases there is no exact sol, and in others there are infinitely many
that I understand but I want to apply this for the aformentioned projection
I am guessing the infinitely many solutions problem is reduced by requiring that the initial "free" components remain untouched
you can add in regularization as you see fit
which I think is exactly what the pseudoinverse through the eigendecomposition would do
it depends
when there are infinitely many sols, you get a projection onto the row space
and well, in the other case as well
this corresponds to a minimum norm solution
(2-norm)
yes it's a L2 norm minimum solution
sorry, but i never understtod what you said, so, this is what i have
Basically it's the standard least squares, but if you have introduced linearly dependent columns in there, making for example A^TA singular
how do people deal with this in practice?
use a pseudo inv
Is this what I am supposed to start with?
eigendecomposition for large matrices is expensive
I usually use a CG solver, but I am worried about the matrix being singular in this case
if you mean in general in practice, if you can't afford to compute the eigendecomp, you use an iterative procedure of your choice
gradient methods, (quasi) newton methods, with or without regularizers
or something heuristic/data driven
in this case you rely on regularization to find the solution you want
can anyone help me by any chance?
this is my question, i just dont know how to find a basis in here.
so I am guessing the issue arises from the fact that neither the rows nor the columns are linearly independent. While in the standard case if the rows are linearly dependent one uses A^TA instead, and if the columns are linearly dependent then AA^T.
seb, you are looking for scalars a,b,c, so that T{av1 + bv2 + cv3} = aT{v1} + bT{v2} + cT{v3} = [0,0;0,0]
where v1 = (1-x), v2 = (1+x), v3 = (1-x^2)
in general you really don't want to use the gramian unless you have good reasons to
these matrices have a really bad condition number
so you'd need preconditioning or very stable algorithms. you'd also need some way to accelerate convergence
the reason is that I need a least squares solution, because the constraints are conflicting
I am aware that the condition number is squared
I just haven't come up with anything better
what i mean is that you can solve || Ax - y || without relying on a system involving AA^T
how?
it still shows up implicitly in some algs, but can be circumvented
you get A^TAx = A^Ty from setting grad = 0
as far as I know one cannot escape this
to be sure, I do not apply CG on A^TAx = A^Ty
I use CGNR, which is stabilized for the normal equations
it's just that I am used to working with A^TA positive definite, and now it's positive semi-definite
albeit I just found some papers that the algorithm supposedly converges even in that case
you could use gradient-free methods
you mean direct solvers? those cannot handle the matrix sizes I am working with
i don't mean direct solvers
stuff like coordinate descent, for example, only relies on per-variable derivatives. instead of dealing with a gramian that pops up when looking at the gradient, you instead deal with 2-norms of columns of the matrix A
there's also stuff like sim. annealing and whatnot
I haven't looked at coordinate descent, but sounds like gradients per coordinate, as far as simulated annealing goes - I have done that for other problems, it's extremely slow (compared to methods that rely on gradient information)
I just need to think over what this means geometrically, because there should certainly be a way to simplify this in order to get a non-singular matrix inversion
e.g. somehow removing the null-space out
well if you find it, do publish it in a paper
it's probably been done already and I just can't find it
or it's probably too trivial to deserve a paper
you can compute the SVD by doing row reduction twice btw
it's not SUPER expensive
you would only have trouble if the matrix is humongous
e.g. can't fit in memory
it is humongous and sparse
I gave up on anything non-iterative a while ago
even LDL^T Cholesky breaks on it
the incomplete one too
breaks in what sense
as in D_ii < 0 for some i, that is, numerical precision
and that's with double precision
oh well
It's alright I'll figure something out, CGNR is supposed to work
at any rate, what you were asking for doesn't exist
you can solve a different problem that gives the same solution, but you can't rewrite it in terms of full rank stuff
I just wanted to figure out something theoretical, but while discussing with you I realized that having both rows and columns linearly dependent poses quite a few issues
since the crux of the problem is precisely that the matrix is not full rank
The problem has a unique solution definitely, I just have to figure out what that translates into algebraically, since I actually want only one solution out of the infinitely many, I just need to be more careful with the modeling.
your matrix A has full column rank?
no that's the problem
the matrix A has full row rank if there are no conflicting constraints
then it does not have a unique solution
then it's trivial and one just does AA^T
๐ be careful with the wording
but when I add conflicting constraints
if you want a specific one, the only way is to add a regularization term
then I want those to be satisfied in a least squares sense
so my guess is that theoretically the conflicting constraints could be taken out and turned into a single one
and then I'll get a full row-rank matrix
e.g. if I I know that row 1,3,5 are linearly dependent, I could take them out, and apply A^TA for those - scratch that, I basically need to turn them into a single constraint that satisfies the 3 in some least square sense
but now you have to exactly solve for this variable too
Lagrange multipliers require non-conflicting constraints
i'm sorry, your nomenclature is throwing me off
what i mean is adding extra variables and requiring them to be positive, for example
Lagrange multipliers require linearly independent rows in the constraint matrix Ax = b
and using lagrange mults for these new constraints
Ax = b is normally called the cost or objective, not constraint
if it were a constraint, the answer to your question formally is "the feasible set is empty"
and you're done
$\min_x f(x), \quad \text{s.t.} Ax = b$
criver
$\begin{bmatrix} Q & A^T \ A & 0 \end{bmatrix} \begin{bmatrix} x \ \lambda\end{bmatrix} = \begin{bmatrix} 0 \ b \end{bmatrix}$
criver
yes the system is inconsistent, instead I replace it with ||Ax - b||^2
the standard way to solve this is A^TAx = A^Tb, except A^TA is singular in this case (as it should be in order to have some interesting feasibility set)
it's a way to satisfy the constraints in a weak sense (L2)
it's equivalent to the standard approach for non-conflicting constraints
so you mean $\min_x f(x) ,,s.t. ,, \Vert Ax - y \Vert_2^2 < \epsilon$
Edd
$\min_x x^TQx, \quad \text{s.t.} \min_{x}|Ax-y|^2$
criver
I am not sure how to formulate it properly
the way i wrote it
the constraints are more important than f(x)
there's no "more important"
in the sense that
if they were non-conflicting
then it would be Ax = b
the minimisation for the constraints is only to break up contradictions there
once this is done the x is "fixed"
e.g. you can reformulate
$min_x x^TQx, \text{s.t.} Ax = b \implies \\min_y y^TZ^TQZy = -Z^TQA^T(AA^T)^{-1}b$
where Z is the null space of A
criver
so you can even rewrite it without any constraints - i.e. they can be factored out before the whole thing, since they are enforced strongly
Either way, this is probably too technical
seems pretty straightforward
if you want to avoid those (pseudo)inverses, the whole problem can be written as a problem with a symmetric psd mat
baiscally I know a unique minimizer exists
yes, but I replace Ax = b with min_x|Ax-b|^2
I can actually give a simple example
imagine a^T x = b is one constraint
you don't have to, it's been clear for like 1hr lol
Uh anybody can tell me why Im not getting the same result using Sarrus as what symbolab or wolfram is outputing?
a^Tx = b, a^Tx = c, the null-space is still (n-1), and I want the x to "satisfy" the above in a least square sense
except the matrix (AA^T) is singular in that case (since the same row repeats two times)
This is what I get, but symbolab outputs 4 instead of 18
4 makes it actually solvable, 18 doesnt
Uh all the x in there are me being dumb and forgetting its lambda
And not x
A thing that would make the matrix non-singular is substituting those two constraints with a single one, e.g. a^Tx = d, such that it minimise (a^Tx-b)^2 + (a^Tx-c)^2. I think I just have to involve Z somewhere in there to reduce this to a non-singular system.
Yeah, I think I figured it out while trying to explain it
I just have to decompose x = Vx_A + Zx_z, such that AZ = 0, and V has independent columns ([V Z] should span R^n). Then Ax = AVx_A = b and (AV)^T(AV) is non-singular
So I just need to solve ||Ax-b||^2 in the space orthogonal to the null-space of A
my mistake was picking V = A^T, which works in the case where A has linearly independent rows, but doesn't otherwise
that's indeed the classical solution to the problem
you can use an iterative alg to build a basis for the null space
I haven't though yet about the algorithm part
I am happy I finally figured out what I was doing wrong theoretically though
i forget the name for this
something along the lines of generalized beamforming in the sigproc community
If possible I wanted to avoid building the basis, that's why I was looking at I-VV^ instead of VV^T
Lmao wat
it shows up in papers from like 50 years ago
Gmres?
hmm?
gmres builds a basis, which is why I avoid it, because I don't want to carry the vectors around with me
i meant your whole problem, cost func and constraints
afaik that's just quadratic programming, albeit they do not introduce conflicting constraints usually
standard optimization books have the non conflicting version
if you want to do it this way, anyway, you need a basis for the image or the null space
you guys are still talking about optimization 
otherwise, use lagrange on the form with epsilon
your form also shows up in opt books lol
They were avoiding doing so in the nonconflicting case through the I - A^T(AA^T)^{-1}A
if it's a QPP then yes, I hated solving those
The conflicting thing? Please do tell so I can read up on this
just look up convex up
