#linear-algebra
wait
the 0 vector is one such that v + 0 = v so
so you'd need (0,1) as the zero vector
For vector sum to be 0, you can define (-a_1, 0)?
got it
"-v"
and the zero vector would be (0,1)
gotta get a bit creative
we're busy working on a problem rn
@lavish jewel If additive identity is (0,1) and the sum of additive inverse is (0,0), is it fine for the elements to be different?
hmm no
i guess we have to take another look at the additive inverse then
how about (-a1, 1/a2)?
In the axioms, it's shown that there exists element denoted by 0 such that x+0 = x
and also there exists an element such that x+y= 0
yeah, it means the same 0 vector for both
Yes. But the zero vector isn't the same in this vector space
Is it a contradiction?
that's why i said we had to take another look, try (-a1, 1/a2)
this should yield (0,1), which does work
Okay
this is also a good point for you to look back at the previous question with that v = {0} thing
maybe now you see what i meant by the 0 vector not necessarily being the 0 you were thinking of
or + not being the usual addition
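A minimal sketch of the point about the zero vector, assuming the operations implied by the discussion: (a1, a2) + (b1, b2) = (a1 + b1, a2 * b2) and c(a1, a2) = (c*a1, a2). The exercise itself is not quoted in full here, so these definitions and the helper names `vadd`, `vneg`, and `ZERO` are assumptions for illustration.

```python
# Assumed (hypothetical) operations, inferred from the discussion:
#   (a1, a2) + (b1, b2) = (a1 + b1, a2 * b2)

def vadd(u, v):
    """Non-standard vector addition: add first entries, multiply second entries."""
    return (u[0] + v[0], u[1] * v[1])

ZERO = (0, 1)  # candidate zero vector for this addition

def vneg(v):
    """Candidate additive inverse (-a1, 1/a2); only defined when a2 != 0."""
    return (-v[0], 1 / v[1])

v = (3, 4)
identity_ok = vadd(v, ZERO) == v        # v + 0 gives back v
inverse_ok = vadd(v, vneg(v)) == ZERO   # v + (-v) gives the same zero vector
print(identity_ok, inverse_ok)
```

Note that the inverse only exists when a2 is nonzero, so whether every vector has one depends on which set the exercise actually uses.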
is every diagonalizable matrix invertible?
what are your thoughts
$(c+d)(a_1,a_2) = (ca_1 , a_2) + (da_1, a_2) = (ca_1 + da_1,a_2 a_2)$
Researcher in Pre-algebra
Is this right?
Looks right to me
What if c+d = e where e belongs to R, and $e(a_1,a_2) = (ea_1,a_2) \neq (ca_1 + da_1,a_2 a_2)$?
Researcher in Pre-algebra
@lavish jewel
@teal grotto I dont think they are
why
isnt zero matrix a counterexample for this?
yes. just wanted him to get to that lol
any diagonal matrix with at least one 0 on the diagonal works as a counter example, since the determinant of a diagonal matrix is the product of its diagonal entries
meguuuuu
I am given a vector $x \in \mathbb{C}^n$ and $b \in \mathbb{C}^m$, where $x$ minimizes $\|Ax-b\|$ and $A$ is an $m \times n$ complex matrix
meguuuuu
@drowsy flower do an orthogonal projection onto the image of A
okay let me see, also idk svd so
you did it backwards, but yes
Backwards?
Okay
Got it
then use distribution and check with the original
So this can't be a vector space?
you cannot do the step in the middle cuz of the coefficients
i think not, cuz you'd get a2^2 yeah?
Yeah
sounds about right
step in the middle?
this here
the chunk in the middle is false
it would be true with usual addition, but not here
Okayy
Then how is it done?
Won't that mean scalar distributivity holds if that chunk is false
backwards
as we just did above
you have (c+d) (a1,a2) = ( (c+d) a1, a2)
and you wanna check if this is equal to c(a1,a2) + d(a1,a2)
but the latter is equal to ( (c+d) a1, a2^2)
okayy
so it isn't distributive
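The check described above can be sketched numerically. This assumes the operations implied by the discussion, (a1, a2) + (b1, b2) = (a1 + b1, a2 * b2) and c(a1, a2) = (c*a1, a2); the function names are hypothetical.

```python
# Assumed (hypothetical) operations, inferred from the discussion.

def vadd(u, v):
    """(a1, a2) + (b1, b2) = (a1 + b1, a2 * b2)"""
    return (u[0] + v[0], u[1] * v[1])

def smul(c, v):
    """c (a1, a2) = (c * a1, a2)"""
    return (c * v[0], v[1])

c, d = 2, 3
v = (1, 5)

lhs = smul(c + d, v)                  # ((c+d) a1, a2)
rhs = vadd(smul(c, v), smul(d, v))    # ((c+d) a1, a2^2)
print(lhs, rhs, lhs == rhs)
```

The two sides agree only when a2 equals a2^2, so a single vector with a2 = 5 is enough to show that (c+d)v = cv + dv fails.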
I don't understand this question
Has someone tried to imagine what is the determinant of a non-square matrix?
I know that if the matrix is not square, the determinant doesn't exist
But idk if there exists any extension of what would be the determinant for non square matrices
it's asking you if you have addition and multiplication by real scalars, do we still have that the a_i are in C
Oh yeah
Do I just say "yes because adding real number to a complex number, it still has complex number as codomain"?
that's not a requirement though
in fact, there is no addition of vector and scalar
Okay, they said coordinate addition and multiplication
remember you're looking at addition V x V -> V and multiplication F x V -> V
right. we have usual addition, usual multiplication, and these are all associative, commutative, etc, and the product of a real and a complex number is complex
Let V = {(a1, a2, . . . , an): a_i ∈ R for i = 1, 2, . . . n}; so V is a vector
space over R by Example 1. Is V a vector space over the field of
complex numbers with the operations of coordinatewise addition and
multiplication?
V can't have R as a codomain so it's not the same vector space right
right
not for scalar multiplication
addition is R^n x R^n -> R^n, and scalar mult is C x R^n -> C^n
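A small illustration of why scalar multiplication by complex numbers leaves R^n, using Python's built-in complex type. The helper `smul` is a hypothetical name for coordinatewise scalar multiplication.

```python
def smul(c, v):
    """Coordinatewise scalar multiplication c * (a1, ..., an)."""
    return tuple(c * x for x in v)

v = (1.0, 2.0)       # a vector with real entries
w = smul(1j, v)      # multiply by the complex scalar i

# True only if every entry of w has zero imaginary part
stays_real = all(complex(x).imag == 0 for x in w)
print(w, stays_real)
```

Since i * (1, 2) has non-real entries, the product does not land back in V, which is the closure failure described above.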
Got it
Changing the vector space itself
got it
Should I move to next topic?
There are 7 more problems
i'd recommend you try a few more, especially if there are harder ones at the end
Okay
$V=\mathbb{R}\cup\{\infty\}, \quad u+v=\min(u,v), \quad c\times u=c+u$ Verify whether this is an $\mathbb{R}$-vector space or not
@radiant yarrow if you want one to do
Mosh
can anyone here help explain the derivation of the perspective projection matrix in 3D graphics.
R union with element infinity?
yes, so the vectors are all the real numbers and infinity (which behaves as you expect it to)
ie it's the "biggest real number"
It behaves as follows: min(u, infty) = u and u+infty = infty for any real u
But Aren't vector spaces defined in finite numbers?
That's all you need
what do you mean?
Vector spaces can be arbitrary sets
You've had matrices, sequences, polynomials, etc. as examples
Not just numbers.
Im also not saying if it's a vector space or not.. the question was to verify if it is or not
Also giving this one cause this was the one that fucked me over when I learned about vector spaces
yes, for scalar c and vectors u and v
Yes, F is a F-vector space
no
c isnt a vector, it's a scalar, so you cant add a scalar and a vector
it is
In u+v, the + refers to vector addition which is being defined to mean min(u,v). In c+u, the + refers to normal addition of real numbers. Same symbol, different meanings.
$1\times u = 1+u$
Mosh
give me a min
for example
Now my prof did use the O symbols for the operations but I couldnt be asked 
c+u isn't possible where + is vector addition. But the + here is normal addition which is possible because it's the scalar multiplication here.
$u\oplus v = \min(u,v), \quad c\otimes u = c+u$
Mosh
What do you mean by that?
Are you saying they have or don't have an additive inverse?
where as no additive inverse for negative numbers
positive numbers have one inverse
negative ones don't
Before you can find inverses
You must identify the identity
What’s the additive identity if there is one?
u itself
No
which vector specifically is the identity
An identity is one element in the whole space
pog infinity as 0 vector moment

Yeah.. what's the problem?
okay
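A rough sketch for experimenting with this example, assuming u ⊕ v = min(u, v) and c ⊗ u = c + u on the reals together with infinity, as stated in the problem. It only spot-checks a few sample values (an assumption of this sketch, not a proof), and the names `vadd`, `smul`, and `samples` are hypothetical.

```python
import math

INF = math.inf

def vadd(u, v):
    """Vector addition from the problem: u (+) v = min(u, v)."""
    return min(u, v)

def smul(c, u):
    """Scalar multiplication from the problem: c (x) u = c + u."""
    return c + u

samples = [-2.0, 0.0, 3.5, INF]

# Infinity behaves as the additive identity: min(u, inf) == u
identity_ok = all(vadd(u, INF) == u for u in samples)

# An additive inverse of u would need min(u, w) == inf for some w
has_inverse = {u: any(vadd(u, w) == INF for w in samples) for u in samples}

# The axiom 1 * u = u would need 1 + u == u
unit_ok = all(smul(1, u) == u for u in samples)

print(identity_ok, has_inverse, unit_ok)
```

Because min(u, w) can never exceed u, a finite u cannot reach the identity, which is the thing to look at when deciding whether the axioms hold.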
Mosh
Did you see example 5?
Hey quick concern
I have the urge to do x_1 * v_1 + x_2 * v_2 = (240, 2824)
This would be the overall production of the company to get this much resources
But when it says each, wouldn't that be 2 separate equations?
x_1 * v_1 = (240, 2824)
and
x_2 * v_2 = (240, 2824)
This would be the number of hours, x_1 and x_2, it takes mine 1 and 2 respectively to get this much resources
Need some help on how to approach this problem
the middle two columns are linearly dependent
start by looking at A(0,1,0,0) and A(0,0,1,0)
also the first and last columns are the same. that should be a big hint
Sorry I'm trying to figure out what you are leading to, are you referring to the identity matrix for A(0,1,0,0) and A(0,0,1,0)?
@fleet orbit (0,1,0,0) is a column vector. A(0,1,0,0) is just regular matrix multiplication on the left by A
this is what I could make out for w and w'
awesome. now look at the first and last columns
yw : )
yeah its so hard to find these methods though, wish my professor showed us more examples
I keep ending up trying to do something like A*(w1,w2,w3,w4)=A*(w1',w2',w3',w4') because im so used to just going to the coefficient matrix and setting up a linear system
try to think what free means and it'll give you a clue
it might help clarity if you get the pivots to be 1, though i don't think all profs require that
how are you trying this
(Speaking on a friend’s account) Initially i tried to find the inverse matrix and premultiply but you cant find an inverse of 2x1 matrix
ok you can see it's a linear transformation though right?
can you think idealistically?
like B' is a little far, if it were close that's one you already know
(and maybe figuring that out will trigger the next bit)
Chief imma be real wit you, I’m a year 11 student and I got no clue what linear transformation is
oh no problem, hahahha
ok but visually you do
you know flipping stuff?
rotating stuff
Yes
cool, matrix multiplication represents those types of transformations
so you have 1 for a flip,
one for just repeating the image (identity)
etc
what i'm saying is, find a simpler matrix first (for example, for a perfectly symmetrical flip)
and then worry about the next step
Trial and error?
not quite
informed exploration hhahahah
i always think of trial and error as just random
can you see how A and C’ are related in the picture
Yes
which other ones just visually look related by that same reasoning
C and A’
and then B and B’ have to be related, just by process of elimination
so K is a two by two matrix. can you see how the second column has to be (2,1)?
oh let me correct myself. i was looking at the image too idealistically.
A and A’, B and B’, and C and C’ are all related
but you're still right about the column
the one thing that’s great is that we know the second column of K has to be (2,1)
Roight
since K is a small enough matrix, you could just try to solve for the first column of K directly
even when you get it though i'd play around transforming the thing
the more intuition you build the less you have to stress about abstract problem solving
The answer is in the textbook but the teacher posed the question but he didn’t even know the answer
that's a bit sad
but linear algebra is amazing, you can leverage it for a lot of things
The context of the question is to find A’, B’ and C’ with the known transformation matrix but he posed it as how can you find the transformation matrix itself just through the image and the end result
right
but that's also not very helpful hahaha
the flavor of the transformation is the key
Yeah I know hahahahaha
the name for this
is a shear @clever merlin
it's a very badass type of transformation
shear: a strain in the structure of a substance produced by pressure, when its layers are laterally shifted in relation to each other.
i don’t think that’s true
not a shear?
i stand corrected nvm
yea nvm. that makes sense. volume doesn’t get scaled by a shear
i honestly forgot haha
i'm not shearing enough in my life...
clearly i'm doing something wrong
🥲
lol
Anticipation
implies orthogonal, i can only say real part <u,v>=0
nvm i found counterexample this is not true
u = (1,1,i) v = (1,-1,1) i think
lol or just u = 1 v = i
the norm of a vector $(z_1, \dots, z_n) \in \mathbb{C}^n$ is $\sqrt{\sum |z_k|^2}$, not $\sqrt{\sum z_k^2}$
Ann
also |1+i|^2 = |1|^2 + |i|^2
and also the inner product in $\mathbb{C}^n$ is $\langle z,w\rangle = \sum z_k \overline{w_k}$
Ann
yea so what i said above is not true for C right
our problems set i think is asking us to prove for R so the fwd direction is. not true for C
cuz this is for L^2 space
I don't think your counterexamples work, 1 and i aren't orthogonal for example
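For reference, the two proposed pairs can be checked directly against the norm and inner product written out above. This is only a numerical sketch; the helper names `inner`, `norm_sq`, and `vsum` are hypothetical.

```python
def inner(z, w):
    """Standard inner product on C^n: sum of z_k * conj(w_k)."""
    return sum(a * b.conjugate() for a, b in zip(z, w))

def norm_sq(z):
    """Squared norm: sum of |z_k|^2, computed exactly from real and imaginary parts."""
    return sum(a.real ** 2 + a.imag ** 2 for a in z)

def vsum(z, w):
    return tuple(a + b for a, b in zip(z, w))

pairs = [
    ((1 + 0j,), (1j,)),             # u = 1, v = i
    ((1, 1, 1j), (1, -1, 1)),       # u = (1, 1, i), v = (1, -1, 1)
]

for u, v in pairs:
    pythagoras = norm_sq(vsum(u, v)) == norm_sq(u) + norm_sq(v)
    print(inner(u, v), pythagoras)
```

In both cases the inner product is purely imaginary and nonzero while the Pythagorean identity still holds, so over C the identity only forces the real part of the inner product to vanish.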
also
$p(x)=a_0+a_1x+a_2x^2+a_3x^3+a_4x^4+a_5x^5$
Tim O'Brien
Let velocity values be x, and force values their corresponding outputs
Yeah you got it
I thought you were solving for x in rref
then we get the exact coefficients
yeah we would have too many unknowns
but yeah computational/applied stuff like this is really chill
Do you know where I can find more stuff like this?
like using computers to fit polynomials/approximate/etc..
I'm not quite sure how poly regression works
look up splines
alright
This looks right up my alley
but its not a function
this website says its a function
xD
what isnt
These circles
/what website
id check out introduction to statistical learning
its free
and read the section on this
and it comes with code
ok cool
or it should i think
holy cr*p
this is nice
seems like a lot of reading
and not a lot of exercises though

its like the baby bible for this stuff
baby bible?
yea the big boy bible is elements of stat learning I struggled with this one
but I couldnt self study that without a professor or someone to routinely talk to
you actually might be able to now given #advanced-probability
here
This seems like really important stuff to learn
but like
looking at it
lots of words
complicated
no exercises
doesn't seem that fun to self study
oh shit yeah
theres R
oh yeah nevermind
there are exercises
but still I want to get through linear algebra first im having a lot of fun with this
I would
do I need calc 3 for stats?
nah not really
some vector calculus is helpful for like neural nets
but even then
its mostly just notational heavy imo
what math do you do
DS?
Oh that's why you know this so well
You going to be making big bucks
in the future
I still don't know what I want to study
hahahahha i actually just want to do research
me too but its hard
oh neuroscience right
we were getting coefficients
yea
i thought you wanted the x terms lmao
no I understand this p well
yeah I learned how to do that as well
oh nice!!!
that idk when they cut it off
yeah maybe not
the rounding error should be super minimal
hhahah it is

i come here way more often than I should
oh yea i was trying to give an example norm(u)^2 + norm(v)^2 = norm(u+v)^2 but the functions are not orthogonal
Yeah, that wasn't gonna work
Is anyone able to critique my (potentially dodgy) linear algebra proof? Thanks
I'm not that good with Latex so it might be hard to read
In the fundamental theorem step, it should be "dim null ST_1+dim range T_2"
@native rampart I see, thanks for pointing that out for me
@native rampart Sorry for all the questions, but does that make the proof invalid? Or could it still be ok if were to fix that up?
Nah, just change it. Typos aren't that big of a deal
Yay! I was feeling adventurous and decided to deviate from the typical techniques the textbook used to solve its problems. It's good to know that the proof worked out 😄
Welp, there is one problem you haven't directly addressed. What if v is in null(T_1) but not in null(T_2)?
If you address that,you can just straightaway define S and you are done
After you address that you get rank(ST_1)=rank(T_2) which doesn't mean T_2=ST_1
It means T_2=A ST_1 for some invertible operator A
Thanks for the heads up, I'm going to try and improve the proof now and see if I can make it a bit nicer
Thanks for the help!
If you want a nicer proof for only if part:
Let $\{e_1, e_2, \dots, e_k\}$ be a basis of null($T_1$). Now extend this to a basis of V, i.e., a basis of V is $\{e_1, e_2, \dots, e_k, e_{k+1}, \dots, e_n\}$
That's a much nicer proof technique than the one I was using.
You see that $\{T_1(e_{k+1}), T_1(e_{k+2}), \dots, T_1(e_n)\}$ will form a basis for range($T_1$)
So you can just choose an S such that $ST_1(e_i)=T_2(e_i) \ \forall i \in \{k+1, k+2, \dots, n\}$
Buncho Dragons
I like how elegant it is too
Now it turns out that T_2(e_i)=0 for the remaining elements in {1,2,3...k}
So such a S exists
Thanks for showing me this method
I'm going to try to replicate it in some of the other exercise problems
Linear algebra brrrrrr
When solving Ax=0 and Ax=b
Why are the terms of the free variables the same ? Because reduced echelon form is always the same ?
The only thing which makes the result differ are the coefficients I solved for, however the terms associated with the parametric vector form in the linear combination stay the same
The book I'm reading says something about it however it slightly confuses me
the thing is that 0 + b = b
so any x that solves Ax = 0 can be added to any x that solves Ax = b, and you still get b
just by linearity, yeah?
let's call them x_0 and x_b so we don't confuse them
Ax_0 = 0, Ax_b = b
Ah I see, so all solutions in a consistent system with free variables are parallel lines ?
i'm not sure this is true
pretty sure it isn't
In a R^2 system *
Okay then I'm confused
in R^2, the domain is a plane
If there is 1 free variable, the solutions form a line
If there are 2, they'd form a plane
the plane
Okay so if there is 1 free variable
The solutions are parallel lines
and with 2 parallel planes ?
hmmm aight i see what you're trying to do
1 free var gives you just 1 line
you have x_b + t*x_0, which is of the form of a line
just one line
Ah I see
you don't get two parallel lines unless you go non-linear i'm pretty sure
But at the end it is all parallel because each time you just change the vector you add to the free variable terms
Yes I understand
i meant b vectors, idk what you meant by base there
a + x_1(b)
For example, and then the a changes
Depending on the right-side variables of the system of linear equations
i'm not sure you can have that in 2D
if you change only a and nothing else, it's the same line
I understand but it's parallel to the other a's right
lets say we have Ax = b, and A is rank 1. the system has a solution if b is in the span of the columns of A
Yes
so b is of the form c*a, for some vector a in the columns of A
Yes
c sub zero ?
if you change c (which is the same as changing the vector b), the resulting line need not be parallel to the previous one
Ah okay I understand
i'm pretty sure we can make a simple counterexample
and t is the same in all solutions of the same matrix, when changing the rhs
Okay cool thanks, I understand now
a simple example would be a matrix
1 0
0 0
let's pick b = [1;0] first
we have solutions of the form [1;0] + t[0;1], yeah?
then change b to [2;0]
the solutions are now of the form [2;0] + t[0;1]
those two lines are not parallel
hmm
or are they
The vector a moves them right
i guess they are lol
doesn't change the direction
Yeah that's what I was confused about haha, if they're parallel and why
then yeah, parallel lines depending on the value of b
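The example above can be sketched in a few lines. It uses the matrix and right-hand sides from the discussion; the helper names `matvec` and `solution` are hypothetical.

```python
A = [[1, 0],
     [0, 0]]

def matvec(M, x):
    """Plain matrix-vector product."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def solution(particular, t, direction=(0, 1)):
    """Point on the solution line: particular + t * direction."""
    return [p + t * d for p, d in zip(particular, direction)]

ts = (-1, 0, 2)

# b = [1; 0]: every point [1; 0] + t[0; 1] solves Ax = b
line1_ok = all(matvec(A, solution([1, 0], t)) == [1, 0] for t in ts)

# b = [2; 0]: every point [2; 0] + t[0; 1] solves Ax = b
line2_ok = all(matvec(A, solution([2, 0], t)) == [2, 0] for t in ts)

# The direction vector solves the homogeneous system Ax = 0
null_ok = matvec(A, [0, 1]) == [0, 0]

print(line1_ok, line2_ok, null_ok)
```

Both solution sets share the direction [0; 1], which spans the null space of A; changing b only moves the particular solution, so the lines are parallel.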
Hmm I always get tired after eating haha
And energetic without eating
Thanks for the help
aight
a line can be parallel to a vector
Ah I see, so a + x_1[5,3]
Ah cool, thanks that's stupid I did not think of that myself haha
A line can be parallel to vector, how could I forget lol
if a tuple is linearly dependent
will other tuples of the same "length" be also linearly dependent ?
tuple meaning (v1,v2,...,vk) where k is the length of the tuple?
(1,0),(2,0) in R^2 are linearly dependent, (0,1),(1,0) are not, providing a counterexample (if I understand you correctly when you say tuple)
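A quick way to check this counterexample, using the fact that two vectors in R^2 are linearly dependent exactly when the 2x2 determinant they form is zero. The helper `det2` is a hypothetical name.

```python
def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

dependent_pair = ((1, 0), (2, 0))
independent_pair = ((0, 1), (1, 0))

print(det2(*dependent_pair))     # zero determinant: linearly dependent
print(det2(*independent_pair))   # nonzero determinant: linearly independent
```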
oh right
thats what i meant
oh sweet
How do i start this problem
cause I know how to transform a basis into a orthogonal basis using Gram Schmidt
but im not sure how to start this problem
I figured it out
you are assumed not to use gram schmidt tho
The projection can be found using A(A^TA)^-1*A^T * b
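One way to see where that formula comes from, as a sketch that assumes A is real with linearly independent columns (so that $A^TA$ is invertible). For a complex A the transpose would be replaced by the conjugate transpose. The symbols $\hat{x}$ and $p$ are introduced here for illustration only.

```latex
\begin{aligned}
&\text{The residual of the best approximation is orthogonal to every column of } A: \\
&A^T (b - A\hat{x}) = 0 \\
&\Rightarrow\; A^T A\,\hat{x} = A^T b \\
&\Rightarrow\; \hat{x} = (A^T A)^{-1} A^T b \\
&\Rightarrow\; p = A\hat{x} = A (A^T A)^{-1} A^T b
\end{aligned}
```

Here $p$ is the orthogonal projection of $b$ onto the column space of $A$, and $\hat{x}$ is the least-squares minimizer of $\|Ax - b\|$.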
hello
if a matrix is found to not have nonzero rows by the time it's reached row echelon form
then we can say for sure that it won't have any by the time it's reached reduced row echelon form right
please ping me for an answer
this is correct
pog
also
the reason we often compute the rank via gaussian elimination
is something to do with how that process makes clear how many rows at most are linearly independent right
@midnight kayak yes. row reduction can be thought of as multiplication on the left by elementary matrices, each of which is invertible.
so say A is an n x m matrix and E is an invertible n x n matrix. the kernel (or null space) of A is going to be the same as the kernel of EA. so by the rank nullity theorem, rank(EA) = rank(A).
this applies directly to row reduction, since if you row reduce an n x m matrix A to its row reduced form A’ by multiplication on the left by elementary matrices, say
(Ek… E1)A = EA = A’
with Ek…E1 = E, then E is invertible so
rank(EA) = rank(A) = rank(A’)
this means that the number of linearly independent rows of A’ is the same as the number of linearly independent rows of A
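The argument above can be spot-checked on a small case. The following is a sketch using exact fractions; the `rank` routine is a plain Gaussian elimination written for illustration, and the matrices A and E are made-up examples rather than anything from the discussion.

```python
from fractions import Fraction

def rank(M):
    """Rank via Gaussian elimination with exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(X, Y):
    """Plain matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

# Elementary matrix: adds -2 times row 1 to row 2 (invertible)
E = [[1, 0, 0],
     [-2, 1, 0],
     [0, 0, 1]]

EA = matmul(E, A)
print(EA, rank(A), rank(EA))
```

Multiplying by E turns the second row into zeros, which makes the dependence visible without changing the rank.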
can someone help with SVM
I don't understand what happens if C is large, I know theres large penalty put into misclassification in this case but how does that affect maximizing the margin?
why will people choose to use small C?
large C means your model will try its hardest not to misclassify anything
which will make it sensitive to outliers or noise in the data
so is overfitting the main problem? or
well that's the big one i can think of rn
and this is pattern recog & ML
pretty good book imo
youre in luck because i happen to have studied SVMs to some extent
but yeah overfitting
the bigger C is, the closer results you get to hard separation (where mistakes are forbidden)
I don't see any other major problem with big C, as overfitting is not too big of an issue in classification, but unless it is computationally harder to do?
like as the constrained minimization problem
and i feel like on the other side if C -> 0 then wouldn't you just be almost always drawing the parameters to 0? so seems pretty boring there
if C is too small you get margins that are too wide and misclassify too much and are essentially meaningless
i.e. w unreasonably close to zero
ic, thanks
actually yea another question was what it meant by "controlling model complexity"
so like if they mean trade-off on the highlighted line there then they implying large C = greater model complexity?
and is that "complexity" in the sense of how hard and/or well numeric solvers will be able to do the minimization?
I think it’s a reference to the fact that larger C will allow your weights to get bigger
large C = more overfitting
complexity there refers to the overfitting. it will prefer curves with large total variation as opposed to smooth ones, since it picks up any random variations as being part of the model
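For reference, a common way to write the soft-margin problem being discussed. This is the standard textbook form, stated here as an assumption since the book's exact equation is not quoted in the chat; the labels $y_i \in \{-1, +1\}$, the bias $b$, and the slack variables $\xi_i$ are symbols introduced for illustration.

```latex
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{subject to} \quad
y_i\,\bigl(w^T \phi(x_i) + b\bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0 .
```

With large $C$ the slack term dominates, so the solver avoids margin violations and approaches hard separation. With small $C$ the $\frac{1}{2}\|w\|^2$ term dominates, which pushes $w$ toward zero and widens the margin at the cost of more violations.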
👍thanks
Hey everyone, I have a basic linear algebra question
I am quite confused by this, why is the third side (the hypotenuse) represented by $$\textbf{v} - \textbf{w}$$?
SMILEYYY
make a drawing
let v and w be position vectors pointing from the origin to some point
Ok, give me a sec
V or W could be either two legs of the triangle, yet the length still doesn't satisfy the hypotenuse
why do you say that?
v - w here is the vector (-3,2)
what's the length of that vector?
shit
thanks
ohhhh I was thinking of the length of the vectors (one of the components, because conveniently one of the components is the length of the whole vector), like the length of v - the length of w, but in reality you were supposed to think of the vector (including all components)
The real trick is to never use SVM

Also be careful with this, in GBM overfitting happens a lot even in classification
@zealous junco tho SVM with a gaussian kernel can outperform but if its just a linear SVM they generally arent ~great~
how do you prove a transformation is onto?
would you do it by contradiction? show there cant be a vector outside its range or smthn
do you have a specific question?
there probably isn't a completely general method to do so (of course, what you said may always work, but it also may not be the best method)
some ways work better in certain situations, etc
How do you justify that $det(A)= det(A^t)$ if A is an elementary matrix of type 3, that is, adding to any row in the identity matrix of order $n$ another row multiplied by a scalar $k$?
Oh no, it wasn't that
Now it's what I wanted to ask
I understand that for the other two types of elementary matrices this holds as they are symmetric so $A^t = A$, but with the third type $A^t \neq A$ so...
kuro
can you argue that A and A^t both have to have determinant 1
kuro
I had to correct a word
They would both have 1·1·1·...·1
Okay
A chain of 1s multiplied until we get to the row or column with an additional k
Oh
And you never multiply by that k
er maybe not always 1
but you should be able to just argue directly by calculating each's determinant
Because in that row/column you just simplify it when applying $\alpha_{ij} = (-1)^{i + j}\cdot det(A_{ij})$, where $A_{ij}$ is the A matrix without the row i and column j
kuro
And as it's the main diagonal, that $(-1)^{i+j} = 1$ always
kuro
kuro
In the sum
So it doesn't have to be always 1
btw
When you calculate it for the transposed matrix, it's the same
The determinant computation will be the same as it will be in both cases:
$1\cdot 1\cdot \dots\cdot (1\cdot \alpha_{ij} + k\cdot \alpha_{pj})\cdot\dots\cdot 1$
kuro
Where $p$ is another row
kuro
And in the column case (which is basically the transposed matrix... I understand what I'm writing, don't worry about this comment)
$1\cdot 1\cdot \dots\cdot (1\cdot \alpha_{ji} + k\cdot \alpha_{jp})\cdot\dots\cdot 1$
kuro
kuro
Well, it makes sense and I proved it
So that makes me understand everything
Thanks @wintry steppe
i didn't really do much but i'm glad you seem to have got it
I got that question because in my LA book there were proofs about some determinant properties and in order to prove that $det(A) = det(A^t)$ they used that the property holds for elementary matrices, but they didn't justify it and I understood that for types I and II it holds for elementary matrices as they are symmetric, but as 3rd type wasn't symmetric, it wasn't obvious
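A small numerical check of the type 3 case, as a sketch. It uses cofactor expansion along the first row; the matrix E and the value of k are made-up examples, and the helper names are hypothetical.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def transpose(M):
    return [list(col) for col in zip(*M)]

k = 7
# Type 3 elementary matrix: identity with k * (row 1) added to row 3
E = [[1, 0, 0],
     [0, 1, 0],
     [k, 0, 1]]

print(E == transpose(E), det(E), det(transpose(E)))
```

E is lower triangular and its transpose is upper triangular, both with ones on the diagonal, so both determinants equal 1 even though E is not symmetric.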
kuro
is there any good books on linear algebra which arent boring as hell
nah. all of em are super boring
Two textbooks of mine have conflicting ideas
Tim O'Brien
But also
Tim O'Brien
Does orientation not mean anything? This is quite confusing
the dots represent different ‘multiplications’.
the first dot is the dot product. the second dot is matrix multiplication
that’s what it looks like
so dot product = matrix multiplication then ?
the standard inner product in R^n is only defined for vectors. not sure how you’re suggesting to interchange them. the entries of matrices after multiplying can be expressed in terms of dot products, which is what the above picture is saying
ok that's quite confusing
in what way
one sec I will show you why it's confusing to me
1 min
actaully nvm
it just works out
1xn nx1 are interchangeable b/w dot product and normal multiplication
It doesn't matter which sequence I use, I will get the same result
@forest quiver "normal" multiplication is kind of unhelpful in linear algebra
alright
dot product and cross product help distinguish what's happening
Im getting to linear transformations next section
so I will learn about dot product more then
i know it's confusing naming conventions
yeah :(
i got confused too at first
do you think that the first time around I go through
I should look for 100% understanding?
no
really?
alright
I will take it with a professor in 2 years as well
so looking forward to that
if you go into other maths, or computer science it has a lot of applications
Its the most useful math I have every done
ever done
the thing I like most about it probably
is what I can do with data
now
it is indeed very useful
like sure you can see vectors as arrows or whatever
but i see them as data points
(data1, data2, data3, etc...)
I'm trying to determine a Basis for a subspace whose vectors are all orthogonal to (-1, 1, 1), and my textbook gives the solution B = {(1, 0, 1), (0, 1, -1)}
which is algebraically written as:
0 = (-1, 1, 1) * (a, b, c)
0 = -a + b + c
- c = -a + b
therefore vector forms (a, b, a-b) define a valid basis for W
algebraically, I got:
0 = (-1, 1, 1) * (a, b, c)
0 = -a + b + c
a = b + c
therefore vector forms (b+c, b, c)
is this not also a valid Basis?
@zenith mauve It is also valid yes
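Both parameterizations can be checked directly. In this sketch the textbook basis comes from (a, b, a-b) and the alternative basis comes from (b+c, b, c) by setting one parameter to 1 and the other to 0; the helper names `dot` and `cross` are hypothetical.

```python
n = (-1, 1, 1)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product in R^3; nonzero exactly when u and v are independent."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

textbook = [(1, 0, 1), (0, 1, -1)]      # from (a, b, a - b)
alternative = [(1, 1, 0), (1, 0, 1)]    # from (b + c, b, c)

for basis in (textbook, alternative):
    orthogonal = all(dot(n, v) == 0 for v in basis)
    independent = cross(*basis) != (0, 0, 0)
    print(basis, orthogonal, independent)
```

Each pair is orthogonal to (-1, 1, 1) and linearly independent, so each is a basis of the same 2-dimensional subspace.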
let S be the set of all pairs (x, y) such that x, y are real numbers and x^2 + y = 0. does this form a subspace of R^2
no right because (1,-1) and (2, 4) are in the set but (1,-1) + (2,-4) = (3,-3) which isnt in the set
am I wrong?
(2, 4) isn't in the set
oh u meant (2, -4)
uh but then you should get (3, -5) from the sum
but the argument still works
:catThink:
Yea meant (2,-4) lmao so it’s not a sub space right cause you get (3,-5) which isn’t in the set @wintry steppe so it’s not closed under addition
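The corrected counterexample, written out as a short check. The membership test `in_S` is a hypothetical helper for the condition x^2 + y = 0.

```python
def in_S(p):
    """Membership test for S = {(x, y) : x^2 + y = 0}."""
    x, y = p
    return x * x + y == 0

p, q = (1, -1), (2, -4)
s = (p[0] + q[0], p[1] + q[1])   # componentwise sum

print(in_S(p), in_S(q), s, in_S(s))
```

Both points lie in S but their sum does not, so S is not closed under addition.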
Yea sorry
Was doing a brief refresher and wanted to make sure
also I stumbled upon this question and I'm a bit confused
why is that not closed under addition here
im a bit confused as to what it means by not in the form
to be in the desired form means that it's a quadratic in t with leading coefficient 1 and linear coefficient 0
the thing they got has leading coefficient 2
im not sure how to start this
my thought was to isomorph a basis for Ker(T) to get a basis for null(A) and show the nullity is the same, then use the dimension theorem to show the rank must be the same too but idk if thats the best way
coycoy
arent we given that with $A=[T]_\beta^\gamma$?
taxminion
if you know that $[T]_{\beta}^{\gamma}$ is already equal to the composition above, then what i have told you is kind of pointless
coycoy
otherwise it’s kind of helpful
i think im a bit confused on the proof in general bc it seems like the matrix being equivalent to the transformation is such a basic fact idek how to start proving it
i mean the proof is about the rank/nullity specifically but it just seems too obvious to me idk
it is pretty obvious, but if you want to prove it you need to use isomorphisms
ig. but it probably doesnt help im not super sure how to prove a transformation has a certain rank or nullity in general
my instinct is to start with the dimension of a basis for the null space but idk
you have to show that the rank of the composition of the isomorphisms (or their inverses, rather) and T has the same rank as T
they did tell you that A = [T]_\beta^\gamma, but nothing about how exactly A is made from T, which is rather trivial, but necessary
your instinct was sort of right. but you are given a basis already, so i don’t know how choosing one is really going to weave into this proof.
once you have the fact that i have given above, then you can show that $\iota_{\mathbf{v}}$ when restricted to the null space of A is an isomorphism between null(A) and the kernel of T
coycoy
yeah doing an isomorphism between null(A) and ker(T) is where i went originally. i got bogged down trying to prove that the dimension was preserved but maybe i didnt need to since its an isomorphism
if U and V are vector spaces over some field F, then U is isomorphic to V if and only if dim U = dim V
right but can i say for sure that the isomorphism of a basis for ker(T) will be a complete basis for null(A)?
no, you just don’t need that fact, unless you are actually trying to show this as a preliminary lemma:
if U and V are vector spaces over some field F, then U is isomorphic to V if and only if dim U = dim V
maybe im just not understanding what you mean. i thought we were trying to prove that the dimension of null(A) and ker(T) are the same, so how could i use that? or do you mean its using that null(A) and ker(T) are isomorphic to prove they have the same dimension?
i’m trying to guide you to an isomorphism between nul(A) and ker(T). by showing that, then dim null(A) will be equal to dim ker(T)
right thats what i meant
so how's this
suppose $\operatorname{nullity}(T)=k$ and a basis for $\ker(T)$ is $\eta=\{v_1,\ldots,v_k\}$.
let $\phi_\beta:V\to F^n$ be the isomorphism such that $\phi_\beta(\beta_i)=e_i$ for $1\leq i\leq n$.
since $A=[T]_\beta^\gamma$, $\phi_\beta(\ker(T))=\operatorname{null}(A)$, making them isomorphic. therefore, $\dim(\ker(T))=\dim(\operatorname{null}(A))$ and $\operatorname{nullity}(T)=\operatorname{nullity}(L_A)=k$.
since the dimensions of the domains of $L_A$ and $T$ are equal, by the dimension theorem, their ranks must be equal as well. qed
taxminion
actually about my SVM question yesterday, why in soft SVM that all non-support vectors are correctly classified points?
or "most"
what are non-support vectors
Those with Lagrange coeff that’s zero
Like the coeff in the complementary slack condition
BTW guys I am learning Linear Algebra on my own and I have finished reading and solving exercises of the books Linear Algebra by Sheldon Axler and Steven Roman. What book for Linear Algebra do you recommend for reading next? I am not looking for something that is much more advanced, but I am looking for a book that links linear algebra and other fields of math, like Graph Theory + Linear Algebra or something. If you know good books that combine two or more fields of math, please tell me. Thanks.
i'd have to look at the cost function to answer you more clearly, but off the top of my head, it has to do with the interpretation of complementary slackness
having a "lagrange coeff" (not really lagrange, these are from KKT) of 0 means the point is already feasible
you need mu_i * g_i(x) = 0, yeah? for conditions g_i(x) <= 0
if you are on the boundary of the feasible set, i.e. when g_i(x) = 0, then you can penalize this by having a nonzero cost, a mu_i different from 0
if you're inside the feasible set, g_i(x) is strictly smaller than 0, and then there is no penalty for this term
so you set mu_i = 0
so if the point has a weight mu_i of 0, it means the inequality was already satisfied
which in your case presumably means the point was on the correct side of the hyperplane
waving my hands wildly
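A toy illustration of the complementary slackness argument above. This is not the SVM problem itself: the one-variable problem, the function name `solve_toy` and the sample values of `a` are all made up for this sketch. It only shows that mu is 0 when the constraint is inactive and positive when the minimiser sits on the boundary.

```python
def solve_toy(a):
    """Minimise (x - a)**2 subject to g(x) = x - 1 <= 0.

    Returns (x, mu) satisfying the KKT conditions:
      stationarity:  2*(x - a) + mu = 0
      feasibility:   x - 1 <= 0 and mu >= 0
      slackness:     mu * (x - 1) = 0
    """
    if a <= 1:
        # The unconstrained minimiser is already feasible,
        # so the constraint is inactive and mu = 0.
        return a, 0.0
    # Otherwise the minimiser is pushed onto the boundary x = 1,
    # and stationarity forces mu = 2*(a - 1) > 0.
    return 1.0, 2.0 * (a - 1.0)


for a in (0.0, 3.0):
    x, mu = solve_toy(a)
    g = x - 1.0
    print(f"a={a}: x={x}, g(x)={g}, mu={mu}, mu*g(x)={mu * g}")
```

In both cases the product mu * g(x) is zero, which is all complementary slackness says: either the constraint is strictly satisfied and gets weight 0, or it is tight and may get a positive weight.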
See that I'm reading a linear algebra book and I'm not going to read another, and I know Axler. It's difficult, so I won't read him until I finish a lot of books.
yea thanks i got it, sort of
my understanding of hard SVM (linearly separable data) is fine i think, its that they normalize so w^t phi(x) ≥ 1 and equality is achieved whenever x is a point on the margin, and so if the KKT coefficients mu ≠ 0, i.e. x is a support vector, then it must be on the margin and contrarily if w^t phi(x) > 1, i.e. a vector not on the margin then mu = 0, i.e. x is not a support vector
but I think im not perfectly understanding soft SVM yet since you have the slack variables so could it ever happen that mu = 0 but the x is classified incorrectly here..
no, that's the point of complementary slackness
well
if you are using interior point methods, at least
for soft svm, this would be like being past the hyperplane, but still being within some distance of it
if you're on the correct side of the hyperplane, mu = 0
if you're on it or on the wrong side, mu gets larger
the inequality constraint is applied to a function of the vectors that is not just dependent on the hyperplane and the point, but rather some transformed distance
soft svm tries to minimize the number of incorrectly classified points and how far away they are from the hyperplane
the only thing you have to look at is the inequality constraint, really
yea thanks i sort of get it now
thanks for explaining, i guess the only main difference here is the support vector no longer required to live on the margin, i.e. when mu = gamma (gamma being the parameter)
but it remains true that those points s.t. mu = 0 is not contributing to making prediction since they are better classified than the support vectors
since for those points we have tn*yn ≥ 1- xi which means it is at least better than equality
that depends on the cost function
we were only discussing the inequality constraints
the original optimization target may include all the points
the inequality constraints only involve misclassified points due to slackness when written in KKT form
yeah, remember you're optimizing some f(...) + sum (mu_i * g_i(x))
There's an exercise in a book that says:
- $A$ is symmetric $\longleftrightarrow$ $A^{-1}$ is symmetric.
- $A$ is antisymmetric $\longleftrightarrow$ $A^{-1}$ is antisymmetric.
They show that:
$A$ is symmetric $\longrightarrow$ $A^{-1}$ is symmetric because $A = A^t$ and $(A^{-1})^t = (A^t)^{-1} = A^{-1}$ so $A^{-1}$ is symmetric.
$A$ is antisymmetric $\longrightarrow (A^{-1})^t = (A^t)^{-1} = (-A)^{-1} = -A^{-1}$ and $A^{-1}$ is antisymmetric.
And finally the book says that the converse is obtained bearing in mind that $A = (A^{-1})^{-1}$, but I don't get it
I get that if $A^{-1}$ is symmetric then $A^{-1} = (A^{-1})^t = (A^t)^{-1}$
But I didn't apply that $A = (A^{-1})^{-1}$ as the book says
Well, and in the antisymmetric case
If $A^{-1}$ is antisymmetric, then $(A^{-1})^t = (A^t)^{-1} = -A^{-1}$ and $((A^{-1})^t)^{-1} = ((A^{-1})^{-1})^t = A^t$
And $-(A^{-1})^{-1} = -A$
Well, we should have that if $A^{-1}$ is symmetric, then $A^{-1} = (A^{-1})^t = (A^t)^{-1}$ and $((A^{-1})^t)^{-1} = ((A^{-1})^{-1})^t = A^t$ and as $A^{-1} = (A^{-1})^t$ then we should have that $A = A^t$
So then $A$ is symmetric
Okay, I don't have more questions
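For reference, one way to read the book's hint about $A = (A^{-1})^{-1}$: the forward implication holds for every invertible matrix, so it can be applied to $B := A^{-1}$ instead of $A$. A sketch of that reading (the name $B$ is introduced here only for illustration):

```latex
\begin{align*}
B := A^{-1} \text{ symmetric}
&\implies B^{-1} \text{ symmetric}
  && \text{(forward implication applied to } B\text{)} \\
&\implies A = (A^{-1})^{-1} = B^{-1} \text{ symmetric.}
\end{align*}
```

The antisymmetric case is identical with "symmetric" replaced by "antisymmetric" throughout.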

i see you've taken the daminark approach to problem solving
Well, before just asking for a solution, I wanted to be sure I wasn't misunderstanding what followed... but then I noticed I could deduce it and I finally got it the way the book wanted
And now the book gave an exercise that was
Show that the product of 2 lower triangular matrices is a lower triangular matrix and that the product of 2 upper triangular matrices is an upper triangular matrix and then deduce a formula for the powers of a diagonal matrix.
The book doesn't give you how to get a formula for the powers of a diagonal matrix, but I observed you can get by induction
That if $A\in\mathfrak{M}_n(\mathbb{K})$, then $A = \begin{pmatrix}
d_1 & 0 & \dots & 0 \\
0 & d_2 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & d_n
\end{pmatrix}$ and
$A^p = \begin{pmatrix}
d_1^p & 0 & \dots & 0 \\
0 & d_2^p & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & d_n^p
\end{pmatrix}$ where $p\in\mathbb{N}$
And you get that the product of two diagonal matrices is a diagonal matrix because as a diagonal matrix is both an upper and lower triangular matrix, and knowing that the product of two upper triangular matrices is another upper triangular matrix and the product of two lower triangular matrices is another lower triangular matrix, then the product of two diagonal matrices must be a diagonal matrix
And then you see it's easy to prove by induction this
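A quick numerical sanity check of the power formula, not a substitute for the induction. The helper names `matmul`, `diag` and `power` and the sample entries are made up for this sketch, and it assumes square matrices stored as lists of rows and an exponent p >= 1.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diag(entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def power(A, p):
    """A**p for an integer p >= 1, by repeated multiplication."""
    result = A
    for _ in range(p - 1):
        result = matmul(result, A)
    return result


d = [2, -3, 5]
p = 4
# The p-th power of diag(d_1, ..., d_n) should be diag(d_1**p, ..., d_n**p).
assert power(diag(d), p) == diag([x ** p for x in d])
```

The loop in `power` mirrors the induction step: if the formula holds for the p-th power, multiplying once more by the diagonal matrix raises each diagonal entry to the next power.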
This is very funny
Who is Daminark?
moderator here who frequently makes long posts to channels to solve their problems out loud / have gomez come in and solve for them
Oh
so @lavish jewel your saying that here, whenever a_n=0 i.e. the constraint is not activated, it means the t*y(x)-1+xi is already in the correct side of the margin/good enough s.t. its misclassification error can be ignored?
yea
Honestly I don’t do that but I have found that sometimes asking questions in channels helps cause it forces me to frame the problem
Yes
also is it true that whenever xi_n is not 0 (the point x_n is beyond the margin) then the kkt constraint is active? otherwise i feel like its not achieving the minimum?
i.e. is it possible for points x_n where the inequality constraint is not active to have positive xi_n
i would hope not, otherwise kkt wouldn't work
this is all just the definition of complementary slackness
Previous test question that I messed up previously but think I have the answer, give me a minute to type it out
$\langle (I-T^2)[w],w\rangle=\langle w,w\rangle - \langle T^2[w],w\rangle = \langle w, w\rangle - \langle T[w],T[w]\rangle$ So want to show that $\langle T[w],T[w]\rangle \geq \langle w,w\rangle$
actually just gonna write it out
Has anyone worked with the Vandermonde determinant?
@wintry steppe not much but i used it
iirc it is something with rows as
x^n x^(n-1) ... 1
?
The proof of the formula for $V_n$... it's being like a pain in the ass for me
,w Vandermonde determinant
so yes
Yes, that is the determinant
I still haven't worked with Jordan expansions
When I studied linear algebra I didn't understand most of the course and this summer I'm trying to learn everything
I almost finished the matrices part and I was doing the book exercises but the Vandermonde determinant is very difficult to work, btw, I need to keep working with it because I must be able to prove it by myself or I won't learn much linear algebra
wait how you calculate determinants then
oof
sorry
With the Laplace expansion
i meant laplace
Laplace and properties
I know you start by doing what you do in the Gauss method to make the first column $\begin{pmatrix}
1 \\
0 \\
0 \\
\vdots \\
0
\end{pmatrix}$
Which doesn't change the determinant
I have to give more context: In my book they give the transpose matrix of that
ah
well then yes
use that addition of multiple of column to column does not change determinant
Well, it's the same determinant as $det(A) = det(A^t)$
basically
if you do this
then determinant would be expressible in terms of determinant for smaller matrix
Yeah, I was trying to do that
Okay
I got it
By induction
But you should do like something weird by columns
not really weird
In order to get the $V_{n-1}$
I'm not used to working with columns, just rows
Then you say that the determinant is that product by hypothesis
And you get it
Well
I'm not at home right now so I can't write it down
Yes. I proved that yesterday
then use it if you are not comf with column ops
Yes, I think I'll use that property, so it's easier for me
Thanks @dire thunder , you are nice
yw
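A small check of the Vandermonde product formula, which can help when testing an induction attempt by hand. Assumptions in this sketch: row i of the matrix is (1, x_i, x_i^2, ..., x_i^(n-1)), so the determinant is the product of (x_j - x_i) over i < j; with the columns in the opposite order, as in the x^n ... 1 layout mentioned above, the sign may differ. The helper names `det` and `vandermonde` and the sample points are made up here.

```python
from itertools import combinations
from math import prod


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def vandermonde(xs):
    """Matrix whose i-th row is (1, x_i, x_i**2, ..., x_i**(n-1))."""
    n = len(xs)
    return [[x ** k for k in range(n)] for x in xs]


xs = [2, 3, 5, 7]
# Product of (x_j - x_i) over all pairs with i < j.
product_formula = prod(xs[j] - xs[i] for i, j in combinations(range(len(xs)), 2))
assert det(vandermonde(xs)) == product_formula
```

Because $\det(A) = \det(A^t)$, the same check applies to the transposed layout from the book.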
without knowing what "your parabola" is, no
this probably belongs in #prealg-and-algebra or #precalculus, not in #linear-algebra
Where does linear dependence come from ? Like it's a non trivial solution
How is that related with dependence?
Or how should I actually make some sense of that twrm
a set of vectors is said to be dependent if some nontrivial linear combination of them (coefficients not all zero) equals the 0 vector
The point of 'dependence' is that if, say, vectors $v_1,\dots,v_n$ are linearly dependent then there exist $\alpha_1,\dots,\alpha_n$ not all zero such that $ \alpha_1 v_1 + \dots + \alpha_n v_n = 0$ and so if (wlog) $\alpha_1 \ne 0$ then we may write $v_1 = -\frac{1}{\alpha_1}(\alpha_2 v_2 + \dots + \alpha_n v_n)$
so we can see that the vector v1 in some sense is dependent on the others as whenever we write v1 we could instead write some combination of all the others
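A concrete instance of that rearrangement, as a sketch. The three vectors, the coefficients `alpha` and the helper `combo` are made up for illustration; any dependent set with a nonzero first coefficient would work the same way.

```python
from fractions import Fraction

v1, v2, v3 = (1, 0, 1), (0, 1, 1), (1, 1, 2)
alpha = (1, 1, -1)  # coefficients, not all zero


def combo(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i], computed componentwise."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))


# Nontrivial combination giving the zero vector: v1 + v2 - v3 = 0.
assert combo(alpha, (v1, v2, v3)) == (0, 0, 0)

# Since alpha[0] != 0, solve for v1:
# v1 = -(1/alpha_1) * (alpha_2 * v2 + alpha_3 * v3)
scale = Fraction(-1, alpha[0])
rebuilt_v1 = tuple(scale * c for c in combo(alpha[1:], (v2, v3)))
assert rebuilt_v1 == v1
```

Here v1 is recovered as v3 - v2, so it carries no information that v2 and v3 do not already provide.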
if you think of it visually it helps a ton
all the formal stuff comes easily after (at least a little)
the formal is necessary for higher dimensions so you can't skip it
Hi. I have a question and I'll describe it a lil bit long. So I got a question like this:
and here's a text solution I have solved myself
Now I have to implement this in Matlab. Here is what I've done so far:
in which: f is the matrix where I write down the transformation f in matrix form, ff is the first column of f and temp is the first row of E, transposed. The terminal looks like below:
What I want to do is to do the exact same thing in the text solution: I'll take the first row of E, then substitute those values into the matrix f, by column, respectively x1, x2 and x3 (to both of its columns), then calculate the new pair, just like this:
And repeat for the following two rows to get 2 more pairs.
As in the terminal result, I got an error. May I ask what causes the problem and how do I solve it? Or do you have another "smarter" solution for this particular question?
Thank you in advance.
i messed up somewhere, not sure where could someone help?
i mean it's simple to test u2+u4 to see if you get the first row value for v correct (which it doesn't add up here)
Hello! I've created my vector and parametric equations, but I don't know what to do from there. How do I solve this sort of question? No answer needed!
@last holly i used the formula v x u1 /u1 x u1. is this not correct?
i mean formulas still have to add up in the end
i just picked u2 and u4 because they have 0's so it's easier to check
ive been redoing it for a while now and its still not adding up
(in a test you could verify quickly this way)
it didnt add up
but im not sure where i messed up because i recalculated and double checked?
are you applying it correctly?
i can show u my work @last holly
Hi



ill think about it later, thanks