#numerical-analysis | Mathematics | Page 21

wide spear Apr 5, 2021, 3:22 AM

#

Yeah

#

Yes there is

#

torch.distributions or scipy.stats might have some things

brave crypt Apr 5, 2021, 3:26 AM

#

there's an analytical solution? i'm looking at the wikipedia page, and i can't imagine it

wide spear Apr 5, 2021, 3:26 AM

#

I think the idea is to find the best zipfian distribution to model the data then see if it's good

#

Deciding yes/no won't be a quantitative thing though

brave crypt Apr 5, 2021, 3:31 AM

#

Yeah, and i cannot imagine doing logarithmic derivative of this thing and setting to 0 is analytically solvable

wide spear Apr 5, 2021, 3:32 AM

#

Numerics!
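A minimal numerical sketch of the "fit the best Zipfian, then judge the fit" idea, assuming a finite number of ranks N with p(k) proportional to k^(-s). The names `zipf_nll` and `fit_zipf_exponent` are hypothetical helpers, not functions from scipy or torch. Under this assumption the negative log-likelihood is a linear term plus a log-sum-exp in s, so it is convex in s and a one-dimensional search is enough.

```python
import math

def zipf_nll(s, counts):
    """Negative log-likelihood of exponent s; counts[k-1] is the count of rank k."""
    n_ranks = len(counts)
    total = sum(counts)
    # Log of the normalising constant sum_{k=1}^{N} k^(-s).
    log_norm = math.log(sum(k ** -s for k in range(1, n_ranks + 1)))
    weighted_logs = sum(c * math.log(k) for k, c in enumerate(counts, start=1))
    return s * weighted_logs + total * log_norm

def fit_zipf_exponent(counts, lo=0.0, hi=10.0, tol=1e-8):
    """Golden-section search for the minimiser of the (convex) NLL on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if zipf_nll(c, counts) < zipf_nll(d, counts):
            b = d  # the minimiser lies in [a, d]
        else:
            a = c  # the minimiser lies in [c, b]
    return (a + b) / 2.0

# Synthetic "data": expected counts for an exponent of 1.5 over 50 ranks.
counts = [1000.0 * k ** -1.5 for k in range(1, 51)]
s_hat = fit_zipf_exponent(counts)
```

Whether the fitted distribution is "good enough" is still a separate judgement, for example by comparing observed and fitted frequencies.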

#

What is a bearlain

brave crypt Apr 5, 2021, 3:38 AM

#

this sounds like a fun problem to explore whether this is convex, or a region near the minimum can be found that is convex, etc. 🙂

wide spear Apr 5, 2021, 3:40 AM

#

You can make it into a convex optimization problem without a doubt

#

Regression can always be made into a convex problem, no?

brave crypt Apr 5, 2021, 3:41 AM

#

i think there is a result like regression with exponential families is always convex or something

wide spear Apr 5, 2021, 3:41 AM

#

Well

brave crypt Apr 5, 2021, 3:42 AM

#

but not in general

wide spear Apr 5, 2021, 3:42 AM

#

Specific types

#

Like if the norm is convex

#

Because then you're just minimizing the sum of convex functions right?

#

A zipfian distribution is probably convex

#

But I don't think it belongs to the exponential family

#

Wait you can just do gradient descent on the sum

#

This is definitely in scipy

#

Stochastic gradient descent

#

stare

brave crypt Apr 5, 2021, 3:48 AM

#

and if scipy does some weird ass way to optimize it then you could try coding it yourself tinktonk

wide spear Apr 5, 2021, 3:49 AM

#

I have the person formerly known as hitbox noted as "cannot code"

#

I wonder why

#

Rip

#

Ummm

#

Probably something from last year

#

Or something

#

I don't remember

#

I can try to dig it up if you want

brave crypt Apr 5, 2021, 3:50 AM

#

i know hitbox as the guy who studies probability

wide spear Apr 5, 2021, 3:50 AM

#

This is true

#

hitbox does do probability

#

Oh

#

Maybe it was because of your sage difficulties

#

Oh in January

#

A Dembo

#

Math 136

#

Rip

#

giggleCat

brave crypt Apr 5, 2021, 4:26 AM

#

I think there are some results like MLE is fucked if it happens on the boundary

#

Or maybe it was more like MLE is provably good only when it happens in the interior, something like that

#

But this is sort of more mathematical rather than statistical

#

The result I recall is about MLE, it probably exists in some mathematical text like shao's mathematical statistics 😜

wide spear Apr 5, 2021, 4:29 AM

#

giggleCat

brave crypt Apr 5, 2021, 9:33 AM

#

hardwiring is always bad:(

glossy yoke Apr 5, 2021, 11:33 PM

#

or maybe i can just handwave it away
it's probably fine
Famous last words

wide spear Apr 5, 2021, 11:33 PM

#

stare

glossy yoke Apr 5, 2021, 11:33 PM

#

Unless you’re doing ML, as ML researchers aren’t held to any standards of statistical rigor

wide spear Apr 5, 2021, 11:34 PM

#

Everyone knows that the best way to get state of the art results is by adding more layers to your neural network

glossy yoke Apr 5, 2021, 11:35 PM

#

Adding layers is for schmucks

#

Real DL researchers use random seed tuning

brave crypt Apr 6, 2021, 5:29 AM

#

@wide spear there is an issue with bc actually 😮

#

Screen_Shot_2021-04-06_at_10.52.37_AM.png

brave crypt Apr 6, 2021, 6:24 AM

#

Imagine the shitshow when it's revealed that all these ML advances were actually bullshit

fleet sail Apr 6, 2021, 12:19 PM

#

anyone know of application of nonorthogonal projections stare

wide spear Apr 6, 2021, 2:13 PM

#

brave crypt

catThink

brave crypt Apr 6, 2021, 3:13 PM

#

wide spear catThink

now is better, but sth still wrong xD

wide spear Apr 6, 2021, 3:15 PM

#

giggleCat

brave crypt Apr 6, 2021, 3:16 PM

#

kekw

#

#

its either nonsmooth like here

#

or red area everywhere then sudden drop on the rhs to blue color...

wide spear Apr 6, 2021, 3:21 PM

#

catThink

#

Perhaps you have some right end boundary condition issues

brave crypt Apr 6, 2021, 3:28 PM

#

thats what my prof told me

#

to apply neumann everywhere but rhs

#

rhs must be dirichlet for uniqueness of the solution

turbid jay Apr 7, 2021, 12:38 PM

#

hey, I don't understand this mini-proof which is about the Polyak step size for convex optimisation. x* is the optimum, the function is beta-smooth. In particular, I don't get the last two steps. It's from this paper (last page): https://arxiv.org/pdf/1905.00313.pdf

eta is the polyak stepsize: $\eta_t = \frac{h_t}{\|\nabla_t\|_2^2} = \frac{f(x_t)-f(x^*)}{\|\nabla_t\|_2^2}$ and the update step is simply: $x_{t+1} = x_t - \eta_t \nabla_t$ . if I plug this in the third line, I get something similar to the second last line. And I'm clueless how they derived the last line
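A small sketch of the update described here, assuming the optimal value f(x*) is known in advance (which the Polyak step size requires). The function name `polyak_gradient_descent` and the quadratic test problem are illustrative choices, not taken from the paper.

```python
def polyak_gradient_descent(f, grad, x0, f_star, steps=500):
    """Gradient descent with eta_t = (f(x_t) - f_star) / ||grad f(x_t)||^2."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # stationary point, nothing left to do
        eta = (f(x) - f_star) / g_norm_sq
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x

# Illustrative problem: f(x) = 0.5 * (x_0^2 + 10 * x_1^2), minimum value 0 at the origin.
def f(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad(x):
    return [x[0], 10.0 * x[1]]

x_final = polyak_gradient_descent(f, grad, [1.0, 1.0], f_star=0.0)
```

For a convex f, each step satisfies $\|x_{t+1}-x^*\|^2 \le \|x_t-x^*\|^2 - h_t^2/\|\nabla_t\|^2$, which is the inequality the case analysis in the proof starts from.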

pine jettyBOT Apr 7, 2021, 12:39 PM

#

lyinch

turbid jay Apr 7, 2021, 12:44 PM

#

Maybe I should also clarify what smooth means:
$f(y) \leq f(x) + \nabla f(x)^T(y-x) + \frac{L}{2} \|x-y\|_2^2$ . It's a quadratic bound on the growth of a function (and not the usual definition)

pine jettyBOT Apr 7, 2021, 12:45 PM

#

lyinch

brave crypt Apr 7, 2021, 12:46 PM

#

Are there bounds on eta?

#

I wonder if the last line is eg optimizing in eta or something

turbid jay Apr 7, 2021, 12:48 PM

#

not that I know of, but I haven't read the full paper. I'm only looking for one specific thing for my own proof... And for the polyak stepsize that we saw in class there are no bounds. It's just this fraction that I wrote

#

however, it depends on the gradient of f(x_t) . And this gradient is limited by the convexity and the smoothness, so maybe that's a way to derive some bounds

brave crypt Apr 7, 2021, 12:51 PM

#

Ah I think it is

#

If eta equals that fraction then it is bigger than 1/beta, by ht >= blah at the top of the image you posted

#

So if eta is bounded above by something reasonable then the minimum is achieved at 1/beta

#

Which gives the last line

turbid jay Apr 7, 2021, 12:54 PM

#

brave crypt If eta equals that fraction then it is bigger than 1/beta, by ht >= blah at the ...

oh... I totally overlooked this condition! I need a second to write this down and try it out

#

thank you, I got it! This seems to work out nicely

#

now I hope that I can use that result in my own exercise 🙂

turbid jay Apr 7, 2021, 2:02 PM

#

I don't get how they derive the condition $h_t \geq \frac{1}{\beta} \|\nabla f(x_t)\|_2^2$ which is the prerequisite for case 3 . That's supposedly a general result for a smooth convex function, and not specific to the Polyak step size
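A sketch of where a bound of this shape comes from, assuming the quadratic upper bound quoted earlier with $L = \beta$, an unconstrained problem, and $x^*$ a global minimiser. With that convention the constant that falls out is $\frac{1}{2\beta}$; the constant in the paper's statement may follow a different convention, so the factor is worth checking against the source.

```latex
% Plug y = x_t - (1/\beta) \nabla f(x_t) into the smoothness inequality
%   f(y) \le f(x_t) + \nabla f(x_t)^\top (y - x_t) + \tfrac{\beta}{2}\|y - x_t\|_2^2
% and use f(x^*) \le f(y) for every y.
\begin{align*}
f(x^*) &\le f\!\left(x_t - \tfrac{1}{\beta}\nabla f(x_t)\right) \\
       &\le f(x_t) - \tfrac{1}{\beta}\|\nabla f(x_t)\|_2^2
            + \tfrac{1}{2\beta}\|\nabla f(x_t)\|_2^2 \\
       &= f(x_t) - \tfrac{1}{2\beta}\|\nabla f(x_t)\|_2^2 .
\end{align*}
% Rearranging:
\[
h_t = f(x_t) - f(x^*) \;\ge\; \tfrac{1}{2\beta}\,\|\nabla f(x_t)\|_2^2 .
\]
```

Only the upper bound and the optimality of $x^*$ are used, so the step really is independent of the Polyak step size.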

pine jettyBOT Apr 7, 2021, 2:03 PM

#

lyinch

fleet sail Apr 7, 2021, 5:58 PM

#

id like to modify this function so it doesnt explicitly compute Q and saves more time in general, how do i do that? This is QR decomposition using householder

Screen_Shot_2021-04-08_at_1.57.19_AM.png

wide spear Apr 7, 2021, 6:00 PM

#

You want to do a QR decomposition without computing Q?

fleet sail Apr 8, 2021, 2:07 AM

#

i guess i just want a more efficient way to do this, right now im applying the householder H to M = [A I] to get the result (Hn...H1)M = [R Q^T] but for instance if im solving a least square problem Ax = b then really i just need to do back substitution on Rx = Q^Tb so is there a way to not compute explicitly the Q or shorten the process somehow to solve that system

wide spear Apr 8, 2021, 2:08 AM

#

What is the goal

#

To compute QR decomp?

fleet sail Apr 8, 2021, 2:08 AM

#

the goal is to use QR decomp in least square

#

so i got this block that uses the QR decomp

Screen_Shot_2021-04-08_at_10.09.11_AM.png

#

ignore the first 2 block of code really they are to identify errors

wide spear Apr 8, 2021, 2:10 AM

#

Wait so you have a least squares problem min norm(Ax-b)

#

And you do QR on A to get A=QR

#

So you know that Rx=Q^Tb

#

So you still explicitly need Q?

#

Am I misunderstanding

fleet sail Apr 8, 2021, 2:11 AM

#

yea that is the idea

#

but apparently this

Screen_Shot_2021-04-08_at_10.11.36_AM.png

wide spear Apr 8, 2021, 2:12 AM

#

Oh

#

So instead of performing Q=H_1H_2...H_n

#

You just store what they are

#

So when you need to compute Qv

#

You just do all the matrix multiplies then

fleet sail Apr 8, 2021, 2:13 AM

#

in what way is this faster though

#

or less expensive whatever that means

wide spear Apr 8, 2021, 2:13 AM

#

Well the idea is that if you don't care about Q you save some time during the QR decomp

#

But then you need to do those computations once you compute Qv

#

So it saves time if you never compute Qv

#

Which is very ?????

fleet sail Apr 8, 2021, 2:15 AM

#

hm

#

so it do look like these all must be computed

#

and theres no large place where i can optimize it significantly

wide spear Apr 8, 2021, 2:26 AM

#

Yeah

#

Shockingly, when you compute a QR decomposition, you compute Q and R

fleet sail Apr 8, 2021, 2:26 AM

#

sadcat thx

wide spear Apr 8, 2021, 2:26 AM

#

giggleCat

fleet sail Apr 8, 2021, 2:26 AM

#

very shocking

brave crypt Apr 8, 2021, 12:02 PM

#

@wide spear

#

oooooooFFFFFF

pine jettyBOT Apr 8, 2021, 1:22 PM

#

whzup

fleet sail Apr 8, 2021, 1:34 PM

#

oh i see

#

ok thx!

wide spear Apr 8, 2021, 2:10 PM

#

I see

#

That’s really something

brave crypt Apr 8, 2021, 4:53 PM

#

https://youtu.be/K8NCa93fGiw
@wide spear @warm otter

YouTube

Rauan K

Unstructured 2D Chorin Projection Method

Solving Navier-Stokes equation until convergence
Left inlet u = 1; v = 0;
Right outlet du/dn = dv/dn = 0
Pressure boundaries Neumann everywhere but the right p = 0


wide spear Apr 8, 2021, 4:54 PM

#

Oh my

brave crypt Apr 8, 2021, 4:54 PM

#

PERFECTOOO

#

the pressure seems to drop for some reason in the middle

#

but the solution is DAMN SMOOTH

wide spear Apr 8, 2021, 4:54 PM

#

Nice

brave crypt Apr 8, 2021, 4:54 PM

#

❤️

brave crypt Apr 9, 2021, 4:07 AM

#

I'm stuck on this problem currently.

wide spear Apr 9, 2021, 4:13 AM

#

What did you try

brave crypt Apr 9, 2021, 5:26 AM

#

Got it with some help wasn’t sure which channel this problem went into

prime kraken Apr 9, 2021, 5:37 AM

#

can someone give me a hand with proximals and the jacobi method?

#

i'm trying to figure out one line in a paper where they give a result without saying anything other than they used those two concepts

#

#

specifically those scalings by the squared column norms of the matrix A

prime kraken Apr 9, 2021, 8:11 AM

#

nvm i got it

tall solar Apr 9, 2021, 10:32 AM

#

Anyone ever seen a paper that defines proximal operators on piecewise functions that are differentiable at the nodes?

azure shuttle Apr 9, 2021, 2:40 PM

#

I think this question belongs here? it is about the kaczmarz method for least squares, which is not common, but I wonder if someone can help. My question is quite simple

wide spear Apr 9, 2021, 3:00 PM

#

What is the question

azure shuttle Apr 9, 2021, 4:12 PM

#

I know it can be used to solve Ax = b if A is overdetermined where there is a solution x, but how can it be translated to least square directly? like if there is no solution x to Ax = b. Someone said to just do x* = arg min norm(Ax - Ax* + Ax* - b) and then solving Ax* = b using kaczmarz. How does that produce the least square?

#

slides https://www.math.ucla.edu/~deanna/talk1.pdf

#

for now I can only see this as order n^2 way of approximating linear system

wide spear Apr 9, 2021, 4:17 PM

#

https://arxiv.org/abs/1205.5770

arXiv.org

Randomized Extended Kaczmarz for Solving Least-Squares

We present a randomized iterative algorithm that exponentially converges in
expectation to the minimum Euclidean norm least squares solution of a given
linear system of equations. The expected...

#

This seems to be what you want

#

There's also the dumb idea

#

Where you just solve the normal equations using Kaczmarz

#

But that probably isn't what you want

azure shuttle Apr 9, 2021, 4:24 PM

#

oh this paper has what I am looking for

#

thank you!

#

it seems regular kaczmarz has no guarantee to converge to least square as expected
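A small sketch of plain (cyclic) Kaczmarz to illustrate this point. The function name `kaczmarz` and the 3x2 test matrices are made up for illustration, and this is the basic row-projection method, not the randomized extended variant from the linked paper. On a consistent system it converges to the solution; on an inconsistent one, each sweep ends exactly on the hyperplane of the last row, so the iterate cannot settle on the least-squares solution unless that solution happens to satisfy the last equation.

```python
def kaczmarz(A, b, sweeps=50):
    """Cyclic Kaczmarz: project the iterate onto each row's hyperplane in turn."""
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            row_norm_sq = sum(a * a for a in row)
            if row_norm_sq == 0.0:
                continue  # a zero row carries no constraint
            residual = b_i - sum(a * xi for a, xi in zip(row, x))
            step = residual / row_norm_sq
            x = [xi + step * a for xi, a in zip(x, row)]
    return x

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]

# Consistent right-hand side: b = A @ [1, 2].
x_consistent = kaczmarz(A, [1.0, 3.0, 5.0])

# Inconsistent right-hand side; the least-squares solution is [5/6, 3/2].
x_inconsistent = kaczmarz(A, [1.0, 2.0, 4.0])
```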

willow nebula Apr 10, 2021, 11:34 PM

#

find the interpolating polynomial of this function through the 5 Chebyshev nodes.
would i plug in the cos(odd# * pi/(2n)) nodes into my original f(x) for the x?

if i had 1/(1+25x^2) would I plug in the cos(stuff) into the x

to get the y for the (x,y)

#

i think im overthinking it too hard

wide spear Apr 10, 2021, 11:37 PM

#

What

#

To do polynomial interpolation

#

You calculate the function at some points

#

In this case the Chebyshev nodes

#

Then you do polynomial interpolation with those points

#

So you should get 5 (x,y) pairs

#

And then you should get a degree 4 polynomial out of the interpolation
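The steps above can be sketched as follows, assuming the function is f(x) = 1/(1+25x^2) on [-1, 1] and using the Lagrange form of the interpolant. The helper names (`runge`, `chebyshev_nodes`, `lagrange_interpolant`) are illustrative.

```python
import math

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def chebyshev_nodes(n):
    """Roots of the degree-n Chebyshev polynomial: cos((2k - 1) * pi / (2n)), k = 1..n."""
    return [math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]

def lagrange_interpolant(xs, ys):
    """Return p with p(xs[i]) = ys[i]; p has degree at most len(xs) - 1."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return p

nodes = chebyshev_nodes(5)              # the 5 x values
values = [runge(x) for x in nodes]      # the 5 y values
p4 = lagrange_interpolant(nodes, values)  # degree-4 interpolating polynomial
```

Evaluating `p4` this way avoids expanding the ugly monomial coefficients by hand; the expanded form is only needed if the exercise asks for it explicitly.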

willow nebula Apr 10, 2021, 11:40 PM

#

yea

#

so i plug in the cos(stuff) into the original f(x) we're given which is 1/(1+25x^2)

wide spear Apr 10, 2021, 11:40 PM

#

Ok so what function are you evaluating

#

f(x)=1/(1+25x^2)

#

Ok

#

What are the 5 chebyshev nodes

willow nebula Apr 10, 2021, 11:41 PM

#

cos(pi/10), cos(3pi/10), cos(pi/2), cos(7pi/10), cos(9pi/10)

#

this is all on [-1,1]

wide spear Apr 10, 2021, 11:42 PM

#

Ok

#

There you go

#

Those are the 5 x values

willow nebula Apr 10, 2021, 11:42 PM

#

ok

#

its just the coeffs i got for the int poly are rly ugly fractions ><

#

LMAO

wide spear Apr 10, 2021, 11:43 PM

#

That's fine

willow nebula Apr 10, 2021, 11:43 PM

#

but i plugged those nodes into f(x) & did the process so ig i did it right just big numbers r scary

wide spear Apr 10, 2021, 11:43 PM

#

Ok

azure shuttle Apr 11, 2021, 3:27 PM

#

fleet sail i guess i just want a more efficient way to do this, right now im applying the ...

Just wanted to add on that I found another optimizing method

wide spear Apr 11, 2021, 3:27 PM

#

catThink

azure shuttle Apr 11, 2021, 3:31 PM

#

Each householder can be applied to A to produce R while avoiding matrix-matrix multiplication, and instead you can iterate over columns c_i of A by c_i - 2(u^T*c_i)u --> c_i

#

Notice each time you do this it is ≤4*n flops

#

Because each hh is $H = I - 2uu^\top$ (with $u$ a unit vector), and so instead of naively doing $HA$ you can apply it to each column by $Hc_i = c_i - 2(u^\top c_i)u$
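A sketch of this column-wise idea applied to least squares, assuming A has full column rank and each reflector is $H = I - 2uu^\top$ with $u$ a unit vector. The function name `householder_lstsq` is hypothetical. The right-hand side is carried along as an extra column, so $Q^\top b$ comes out of the same sweep and Q is never formed.

```python
import math

def householder_lstsq(A, b):
    """Solve min ||A x - b||_2 with Householder reflections, without forming Q.

    Assumes A (a list of m rows of length n, m >= n) has full column rank.
    """
    m, n = len(A), len(A[0])
    # Store columns; b is appended so that it is transformed into Q^T b.
    cols = [[A[i][j] for i in range(m)] for j in range(n)] + [list(b)]
    for k in range(n):
        x = cols[k][k:]  # copy of the part of column k to be reflected
        norm_x = math.sqrt(sum(v * v for v in x))
        if norm_x == 0.0:
            continue
        # u = x + sign(x_0) * ||x|| * e_1, then normalised to unit length.
        u = list(x)
        u[0] += math.copysign(norm_x, x[0])
        norm_u = math.sqrt(sum(v * v for v in u))
        u = [v / norm_u for v in u]
        # Apply H to each remaining column: c <- c - 2 (u^T c) u.
        for c in cols[k:]:
            dot = sum(ui * ci for ui, ci in zip(u, c[k:]))
            for i, ui in enumerate(u):
                c[k + i] -= 2.0 * dot * ui
    # Back substitution on R x = (Q^T b)[:n], where R[i][j] = cols[j][i].
    qtb = cols[n]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = qtb[i] - sum(cols[j][i] * x[j] for j in range(i + 1, n))
        x[i] = s / cols[i][i]
    return x

x_ls = householder_lstsq([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 2.0, 4.0])
```

Each reflector touches one column with a dot product and an update, which is where the roughly 4n flops per column comes from.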

pine jettyBOT Apr 11, 2021, 3:38 PM

#

eonian

wide spear Apr 11, 2021, 3:39 PM

#

Oh nice

#

Well

#

Yes

#

The householder matrix is a rank-one update of the identity

#

And doesn't have a lot of information

azure shuttle Apr 11, 2021, 3:51 PM

#

I implemented one in python that uses both of these just now, I was interested and wanted to practice

#

The overall is O(n^2m) for solving the minimization

wide spear Apr 11, 2021, 3:53 PM

#

n^(2m) or (n^2)m

#

Parentheses are important

azure shuttle Apr 11, 2021, 3:53 PM

#

The latter, the former would be crazy

wide spear Apr 11, 2021, 3:53 PM

#

Yes

azure shuttle Apr 11, 2021, 3:55 PM

#

No , I think anticipation has it above

craggy mauve Apr 11, 2021, 4:02 PM

#

henlo, my professor left as an exercise to my class to estimate a value for pi using the monte carlo method (drawing a unit circle inside the unit square, distributing n points inside the square and counting those who fall inside the circle)

the catch is that we were also asked to provide a thought process for choosing n such that our approximation is accurate to 0.05%

what does it even mean to be accurate by 0.05%?
any tips on how to choose n? of course, without using the trial and error approach

wide spear Apr 11, 2021, 4:05 PM

#

It means that your estimated value of pi is within 0.05% of the true value of pi

#

You should have some theorems giving you convergence bounds

craggy mauve Apr 11, 2021, 4:35 PM

#

well talking about precision in percentage is still a bit confusing but i'll get the idea eventually

#

i checked the reference textbooks for this class and couldnt find anything useful on convergence

#

do you know any references (links, books etc) i could use?

wide spear Apr 11, 2021, 4:37 PM

#

Central Limit Theorem

#

You might want to ask about this in #probability-statistics
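A rough sketch of how the Central Limit Theorem turns into a sample size, assuming points are drawn uniformly in the unit square, a hit means landing inside the quarter disc of radius 1, and "accurate to 0.05%" is read as a relative error of at most 0.0005 with about 95% confidence (z = 1.96). The function names and the confidence level are assumptions, not part of the exercise.

```python
import math
import random

def samples_needed(rel_tol=0.0005, z=1.96):
    """Smallest n with z * sqrt(Var / n) <= rel_tol * pi (CLT approximation)."""
    # Each sample contributes 4 * Bernoulli(pi / 4), whose variance is pi * (4 - pi).
    variance = math.pi * (4.0 - math.pi)
    return math.ceil(z * z * variance / (rel_tol * math.pi) ** 2)

def estimate_pi(n, seed=0):
    """Monte Carlo estimate: 4 times the fraction of points inside the quarter disc."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / n

n_required = samples_needed()
pi_rough = estimate_pi(200_000)
```

Using the true value of pi inside `samples_needed` is slightly circular; in a write-up, the variance can be bounded instead (for example by 4, since p(1-p) <= 1/4), which gives a somewhat larger but pi-free n. Either way the answer is in the millions of points.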

fleet sail Apr 12, 2021, 12:49 AM

#

wow i think i see what u mean

fleet sail Apr 12, 2021, 12:50 AM

#

azure shuttle Because each hh is $H = I - 2uu^\top$, and so you can choose instead of naively d...

thx! this is pretty good

tall solar Apr 13, 2021, 1:32 AM

#

Anyone know where to find a rigorous proof that l1 minimization enforces sparsity?

brave crypt Apr 13, 2021, 1:46 AM

#

Maybe this is really just a proof of lagrange multipliers

#

"enforces" sparsity is sort of a too strong word, more like pushes towards sparsity. I think a formalization of what this even means will probably just come down to the fact that a ball in l1 is pointy

wide spear Apr 13, 2021, 2:00 AM

#

Hmmmm

#

This reminds me of the Pontryagin Maximum Principle

#

But this is a completely different setting

#

Or

#

Maybe not so much

#

Well

#

It's optimal control theory

prime kraken Apr 13, 2021, 6:09 AM

#

@tall solar most proofs hinge on the strong equivalence of the l0 pseudo norm and the l1 norm

#

if the l0 version of the problem has a unique solution, the convex relaxation, which is the l1 norm (the convex hull of the l0 one), finds the same solution

#

so they usually work with the "spark", "girth", or "kruskal rank" of the matrix in the noiseless case, or with the null space property or restricted isometry property for stable recovery with "noise"

turbid jay Apr 13, 2021, 8:16 AM

#

how can I prove that $\log(\sum_i \exp(a_ix+b_i))$ is convex? I guess I could use Hölders inequality somehow, but I don't know what to do with the affine function in the exponential.

pine jettyBOT Apr 13, 2021, 8:18 AM

#

lyinch

turbid jay Apr 13, 2021, 8:23 AM

#

this might sound dumb... But I could just extend the dimension (homogeneous coordinates?) by one and use $x' = [x,1]$ and $a_i' = [a_i,b_i]$ and then I have a new identical function: $\log(\sum_i \exp(a_i'x'))$ in $d+1$ dimensions on which I can apply Hölders inequality. Would that work?

pine jettyBOT Apr 13, 2021, 8:23 AM

#

lyinch

brave crypt Apr 13, 2021, 8:39 AM

#

i'm not sure about using holder's inequality, but i think taking a second derivatives is not that bad, and it reduces to cauchy's inequality

fleet sail Apr 13, 2021, 8:57 AM

#

brave crypt i'm not sure about using holder's inequality, but i think taking a second deriva...

I don't think its quite cauchy because i tried and got (a_1^2x_1+...+a_n^2x_n)(x_1+...+x_n) - |<a,x>|^2

#

unless i did it wrong or didnt see something

#

where x_1 = e^(a_1x+b) or whatever

brave crypt Apr 13, 2021, 9:01 AM

#

yeah this is cauchy with (a1 sqrt(x1), ..., an sqrt(xn)) and (sqrt(x1), ..., sqrt(xn))
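Written out for scalar $x$, with $z_i = e^{a_i x + b_i}$ playing the role of the $x_i$ in the message above, the second-derivative argument can be sketched like this. The symbol $z_i$ is introduced here only for readability.

```latex
\begin{align*}
f(x)   &= \log \sum_i z_i, \qquad z_i = e^{a_i x + b_i} > 0, \\
f'(x)  &= \frac{\sum_i a_i z_i}{\sum_i z_i}, \\
f''(x) &= \frac{\left(\sum_i a_i^2 z_i\right)\left(\sum_i z_i\right)
               - \left(\sum_i a_i z_i\right)^2}{\left(\sum_i z_i\right)^2}.
\end{align*}
% Cauchy--Schwarz with the vectors (a_i \sqrt{z_i})_i and (\sqrt{z_i})_i:
\[
\left(\sum_i a_i z_i\right)^2
  = \left(\sum_i \bigl(a_i\sqrt{z_i}\bigr)\sqrt{z_i}\right)^2
  \le \left(\sum_i a_i^2 z_i\right)\left(\sum_i z_i\right),
\]
% hence f''(x) \ge 0.
```

For vector $x$ the same computation applies to the restriction $t \mapsto f(x + t v)$ along any direction $v$, which has the same form with scalars $a_i^\top v$ in place of $a_i$, so the full Hessian never has to be written down.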

fleet sail Apr 13, 2021, 9:02 AM

#

ah i c ic

turbid jay Apr 13, 2021, 9:23 AM

#

I think I got it via Hölder's inequality, but it's an ugly mess. I'll now try the second derivatives as you mentioned

turbid jay Apr 13, 2021, 9:43 AM

#

that's the monstrosity I did

#

1: I just substitute the exp() to make it easier to read and in 2 I apply Hölder's inequality: $\sum_{i=1}^n |x_i y_i| \leq \left( \sum_{i=1}^n |x_i|^p \right)^{\frac{1}{p}} \left( \sum_{i=1}^n |y_i|^q \right)^{\frac{1}{q}}$ where $\frac{1}{p} + \frac{1}{q} = 1$ with $\frac{1}{p} = \lambda$ and $\frac{1}{q} = (1-\lambda)$, $0 \leq \lambda \leq 1$

pine jettyBOT Apr 13, 2021, 9:44 AM

#

lyinch

brave crypt Apr 13, 2021, 9:50 AM

#

nice. oh, is x a (d dimensional)vector?

turbid jay Apr 13, 2021, 9:51 AM

#

euhm yes, I probably should have given more context 🙄

brave crypt Apr 13, 2021, 9:51 AM

#

ah, i guess 2nd derivative/hessian will be a mess then

turbid jay Apr 13, 2021, 9:51 AM

#

f is from R^d -> R, and a_i^T are the rows of a matrix of mxd

#

that was apparently the easy part, now I have to prove smoothness and then that there's a minimizer 😄 Thanks for your help!

prime kraken Apr 13, 2021, 10:09 AM

#

i think the path they expected you to use was that exp is convex, composition with affine mappings is convex, nonnegative weighted sums preserve convexity, and composition with a monotonically increasing scalar func (log) is convex

#

if you've already shown that, anyway

#

because showing those things independently is a lot easier than going HAM with the hessian here

#

😛

turbid jay Apr 13, 2021, 10:16 AM

#

I think we've only shown that composition of convex function with an affine function preserves convexity... But I agree, maybe I could show the other properties

prime kraken Apr 13, 2021, 10:17 AM

#

these are classic "cookbook" properties, you could probably google the proofs

#

this is the first thing i found, it has some short handwavy proofs

#

https://see.stanford.edu/materials/lsocoee364a/03ConvexFunctions.pdf

brave crypt Apr 13, 2021, 10:38 AM

#

ah, i was gonna say that is smarter. but can we apply it actually? f(x)=h(g(x)), and g must be convex and h must be convex and non-decreasing

prime kraken Apr 13, 2021, 10:41 AM

#

maybe it's one of those log-convex instead of using just composition

#

i don't recall which one exactly it is

#

hmm seems i was mistaken about the last step, you have to show it's log convex indeed

fleet sail Apr 13, 2021, 12:52 PM

#

Does conjugate gradient have a chance to fail if (A^T)A is only pos semidefinite?

#

because maybe the line search doesnt yield a min point and instead gives inflection point, but perhaps that is really rare?

#

fail in the sense that it doesn't converge in dimV iteration

prime kraken Apr 13, 2021, 1:24 PM

#

inflection points are indefinite

#

semidefinite means it's rank deficient and has infinitely many solutions

#

you'll find one of them, but i'm not sure what happens if the observed vector is not exactly in the image of A

fleet sail Apr 13, 2021, 1:32 PM

#

prime kraken semidefinite means it's rank deficient and has infinitely many solutions

hm are you sure that is the case?

#

if you are referring to (A^T)A having infinite solutions

#

then i dont think its always true, semi definite doesnt mean there is a 0 eigenvalue

prime kraken Apr 13, 2021, 1:35 PM

#

positive definiteness is directly related to the eigenvalues

#

it pretty much means all the eigenvalues are > 0

#

semidefiniteness, >= 0

#

so it might be rank deficient

#

you're minimizing something like Ax - b, right? you'd need b to be in the image of A for it to have a unique solution

fleet sail Apr 13, 2021, 1:37 PM

#

yea, so instead im solving the normal equation

prime kraken Apr 13, 2021, 1:37 PM

#

if A is not full column rank, A^T A is rank deficient

fleet sail Apr 13, 2021, 1:37 PM

#

yea, ok

#

but if A is full rank then A^T A must be PD?

prime kraken Apr 13, 2021, 1:38 PM

#

yes

fleet sail Apr 13, 2021, 1:38 PM

#

hm interesting

prime kraken Apr 13, 2021, 1:38 PM

#

full column rank

#

A could be a tall matrix

fleet sail Apr 13, 2021, 1:38 PM

#

because there exist A^T A positive semidefinite but invertible

#

or i mean

#

not A^T A

#

but

#

a general matrix

#

So like a general symmetric positive semidefinite matrix can be invertible

prime kraken Apr 13, 2021, 1:39 PM

#

depends what you mean by invertible

#

you can get an exact solution if b is in the image of A

#

if it isn't, you'll get some projection

fleet sail Apr 13, 2021, 1:40 PM

#

yea i mean as a square matrix totally invertible

prime kraken Apr 13, 2021, 1:41 PM

#

if it's invertible it's pos def

#

all eig vals > 0

#

(if we're still talking about A^T A)

fleet sail Apr 13, 2021, 1:42 PM

#

ok thanks, ill think about it

#

right i was just confusing myself

#

since if something was just symmetric, positive semidefinite but not positive definite then it must have a 0 eigenvalue

prime kraken Apr 13, 2021, 1:46 PM

#

it "may", not necessarily must

fleet sail Apr 13, 2021, 1:47 PM

#

why not, if it is symmetric

prime kraken Apr 13, 2021, 1:47 PM

#

but yeah, it won't have inflection points

#

well, it's >= 0

#

0 is also >= 0

#

😛

fleet sail Apr 13, 2021, 1:47 PM

#

but if it was all > 0 then it would be positive definite

prime kraken Apr 13, 2021, 1:47 PM

#

positive definite mats are also positive semidefinite

fleet sail Apr 13, 2021, 1:47 PM

#

ah i meant

#

positive semidefinite but not positive definite

prime kraken Apr 13, 2021, 1:48 PM

#

ah

#

then yes

fleet sail Apr 13, 2021, 1:48 PM

#

aight nice everything make sense then

#

thx

prime kraken Apr 13, 2021, 1:48 PM

#

k

bright palm Apr 13, 2021, 2:10 PM

#

This sounds silly and is maybe a #calculus question (if so LMK and I'll move it)

#

I'm having trouble wrapping my head around the 2n-1 degree exactness in gaussian quadrature

#

I can sort of get my head around that if we have n x's, we can easily get n degree accuracy

#

but why 2n-1

pine jettyBOT Apr 13, 2021, 2:12 PM

#

jan Niku

bright palm Apr 13, 2021, 2:16 PM

#

er, sorry

#

I've made a typo, that should say f(x) is a fifth degree polynomial

#

2n-1 degree exactness, and all

fleet sail Apr 13, 2021, 2:21 PM

#

bright palm er, sorry

the proof is quite involved but smart at the same time, gaussian quadrature uses legendre polynomials

bright palm Apr 13, 2021, 2:26 PM

#

hmm okay

#

maybe ill try to suffer through a proof

#

https://math.stackexchange.com/questions/3210951/proof-of-exactness-of-gaussian-laguerre-quadrature-integration

Mathematics Stack Exchange

Proof of exactness of Gaussian-Laguerre quadrature integration

The Laguerre polynomials $a_{0}(x), a_{1}(x), a_{2}(x), \dots$ form an orthogonal set on $[0, ∞)$ and satisfy:

$\int_{0}^{\infty} e^{-x} a_{i}(x) a_{j}(x) d x=0, \quad i \neq j$

The polynomial $a...

#

maybe just intuition is enough at this point though 😄

#

it sort of makes sense that 6 unknowns should give you 5 degrees of freedom

fleet sail Apr 13, 2021, 2:27 PM

#

ive got a "proof" but it doesn't prove how to find the orthogonal polynomial

#

but other than that it is a proof

#

that you can reference

bright palm Apr 13, 2021, 2:27 PM

#

If it's easy to link, sure

#

I may ask my teacher exactly what level of comprehension I should be aiming at

fleet sail Apr 13, 2021, 2:28 PM

#

📎 Lecture_24_Slides.pdf

bright palm Apr 13, 2021, 2:28 PM

#

thank you catblush

fleet sail Apr 13, 2021, 2:28 PM

#

start from pg5

bright palm Apr 13, 2021, 2:31 PM

#

yea, we are tasked with this in the homework, which is not so bad

#

the mechanical portions arent too painful, the exactness and the transformation are conceptually strange to me though

fleet sail Apr 13, 2021, 2:32 PM

#

only that the system is not linear wew

#

so it gets hard really quickly

#

wait i c hold up

bright palm Apr 13, 2021, 2:37 PM

#

we just use newtons for systems in our class

#

im sure that means this stuff is curated so that we get nice answers using this method

#

but it works for now

wide spear Apr 13, 2021, 3:08 PM

#

catThink

bright palm Apr 13, 2021, 3:30 PM

#

would someone be willing to help me with a transform blobsweat

#

I'm looking at this problem

#

#

n=3 gaussian quadrature coefficients over -1 to 1

#

so if id like to integrate over 0 to 1 instead, each weight should become 1/2 of its original value?

wide spear Apr 13, 2021, 3:32 PM

#

I think that these weights are for 0 to 1

#

For -1 to 1 you have x_i at -sqrt(3/5), 0, and sqrt(3/5)

bright palm Apr 13, 2021, 3:35 PM

#

oh 🤔

#

maybe I can ask for a different problem then

wide spear Apr 13, 2021, 3:35 PM

#

stare

bright palm Apr 13, 2021, 3:35 PM

#

well wait

#

so if we moved from -1 to 1

wide spear Apr 13, 2021, 3:35 PM

#

Ok

bright palm Apr 13, 2021, 3:36 PM

#

we begin with $\pm \sqrt{\frac{3}{5}}, 0$

#

as weights

pine jettyBOT Apr 13, 2021, 3:36 PM

#

jan Niku

wide spear Apr 13, 2021, 3:36 PM

#

These aren't the weights

#

These are the points

#

The weight at 0 is 8/9 and the weight at the other two points is 5/9

bright palm Apr 13, 2021, 3:37 PM

#

okay

#

so as we move from [-1, 1] to [0, 1]

#

each weight should become 1/2 of its original value

#

as (b-a)/2 = 1/2

#

so 4/9 and then endpoints of 5/18

wide spear Apr 13, 2021, 3:38 PM

#

Yes when you move to the interval [0,1]
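A short sketch of the interval change discussed here, using the standard 3-point Gauss-Legendre nodes and weights on [-1, 1] quoted above. The function name `gauss3` is illustrative. Mapping to [a, b] scales every weight by (b - a)/2 and moves each node to the midpoint plus (b - a)/2 times the reference node, so on [0, 1] the weights become 5/18, 4/9, 5/18.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1].
NODES = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def gauss3(f, a, b):
    """Approximate the integral of f over [a, b] with the 3-point rule."""
    half = (b - a) / 2.0  # weight scale factor
    mid = (a + b) / 2.0   # image of the reference node 0
    return half * sum(w * f(mid + half * t) for t, w in zip(NODES, WEIGHTS))

# Degree 5 = 2n - 1 is integrated exactly (up to rounding); degree 6 is not.
fifth = gauss3(lambda x: x ** 5, 0.0, 1.0)
sixth = gauss3(lambda x: x ** 6, 0.0, 1.0)
```

The exactness count matches the earlier question: n nodes and n weights give 2n free parameters, enough to match the 2n monomials of degree 0 through 2n - 1.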

bright palm Apr 13, 2021, 3:50 PM

#

sorry im at work

#

but this doesnt seem complicated

#

i think i just lack confidence

#

thanks for the help

wide spear Apr 13, 2021, 3:51 PM

#

catThumbsUp

fleet sail Apr 14, 2021, 2:05 AM

#

Here's my current proof that the line search directions of conjugate gradient are Q_A = A^T A conjugate

#

Screen_Shot_2021-04-14_at_10.05.25_AM.png

#

I believe a problem is that I assumed a_i ≠ 0

#

any way to get around it or justify if that's fine? btw a_i is this

#

Screen_Shot_2021-04-14_at_10.07.35_AM.png

#

it's the optimal scaling value for each line search in the Q_A conjugate direction

#

so the only thing I realize is if a_k = 0 then r^(k) and d^(k) are orthogonal and also means there's no improvement in that particular iteration

#

which I didn't say clearly in the proof, I just said "which means least squares is solved" which isn't the case

wide spear Apr 14, 2021, 2:12 AM

#

catThink

fleet sail Apr 14, 2021, 3:28 AM

#

wide spear catThink

btw do u know a faster way to do $v^\top A v$, where A is an inner product, than just to do the standard (dumb) way $v^\top(Av)$

pine jettyBOT Apr 14, 2021, 3:28 AM

#

Anticipation

wide spear Apr 14, 2021, 3:29 AM

#

Implemented practically, calling matrix-vector multiplies will probably be the fastest

#

Because you'll be calling BLAS routines

#

Of course, if you know more about A, you might be able to do something

#

Doing two mat-vec mult is O(n^2)

#

I think 4n^2?

#

Doing anything like decomposing A will be more expensive

fleet sail Apr 14, 2021, 3:30 AM

#

ic

#

yea O(n^2)

#

so its like it makes no difference if A induces an inner product or not sadcat

#

tfw u want to solve a linear system exactly under O(n^3) realshit looks like law of nature dont let u

wide spear Apr 14, 2021, 3:33 AM

#

Well if A is dense with no structure, there is no hope

#

But

#

This is why I'm giving a talk!

fleet sail Apr 14, 2021, 3:34 AM

#

nice

brave crypt Apr 14, 2021, 3:37 AM

#

But eg if you are doing this multiple times for many v, then maybe there is a trick like a decomposition. Or even if A is changing, if the decomposition plays nice with the updates, then maybe it is still good

#

(I guess A is an inner product means it is positive definite?)

wide spear Apr 14, 2021, 3:39 AM

#

Yes

#

I think A is constant for CG

#

You would need to do v^TAv at least n times for a decomposition to make sense

#

And the point of CG is to use fewer steps than that

brave crypt Apr 14, 2021, 3:40 AM

#

Oh

wide spear Apr 14, 2021, 3:42 AM

#

CG, as a member of the Krylov Subspace family of methods, will give the exact solution in n steps

#

But in terms of cost, running all n steps is comparable to doing LU directly

#

However, it will converge much faster to a solution

#

So you can get convergence within machine epsilon well before this point
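A bare-bones sketch of conjugate gradient for a symmetric positive definite A, to make the "at most n steps, often far fewer" remark concrete. The function name and the 2x2 example are illustrative, and dense Python lists stand in for whatever BLAS-backed matrix-vector product would be used in practice.

```python
def conjugate_gradient(A, b, tol=1e-12):
    """Solve A x = b for symmetric positive definite A; returns (x, iterations)."""
    n = len(b)

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def matvec(v):
        return [dot(row, v) for row in A]

    x = [0.0] * n
    r = list(b)   # residual b - A x for the starting guess x = 0
    d = list(r)   # first search direction
    rs_old = dot(r, r)
    iterations = 0
    for _ in range(n):  # exact arithmetic needs at most n steps
        if rs_old ** 0.5 <= tol:
            break
        Ad = matvec(d)
        alpha = rs_old / dot(d, Ad)  # optimal step along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rs_new = dot(r, r)
        # New direction is A-conjugate to the previous ones.
        d = [ri + (rs_new / rs_old) * di for ri, di in zip(r, d)]
        rs_old = rs_new
        iterations += 1
    return x, iterations

x_cg, n_iter = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The only place A appears is the matrix-vector product, and the step size is zero only when the residual is zero, which ties in with the alpha = 0 observation from the proof discussion.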

prime kraken Apr 14, 2021, 3:46 AM

#

will you cover sparse regularization in your talk?

wide spear Apr 14, 2021, 3:47 AM

#

I'll cover some sparse stuff

#

Probably not regularization though

#

It's only an hour

#

And there's a lot of stuff to potentially cover

prime kraken Apr 14, 2021, 3:49 AM

#

fair enough, yeah

fleet sail Apr 14, 2021, 1:27 PM

#

fleet sail which I didn't say clearly in the proof, I just said "which means least squares ...

Think I solved this, 2 things: the proof is really nice, its induction with 3 statements and you use all of them to prove the next step and second is alpha = 0 will never happen because that would mean r = 0 and we would be done in the previous step

normal lava Apr 14, 2021, 2:50 PM

#

#

can someone help me with this optimization problem

prime kraken Apr 14, 2021, 3:01 PM

#

if all you have to do is verify, you can multiply the two expressions and show you get an identity matrix

normal lava Apr 14, 2021, 3:04 PM

#

Just multiply B(k+1) and B(k+1)^(-1)?

wide spear Apr 14, 2021, 3:09 PM

#

Yeah

normal lava Apr 14, 2021, 3:11 PM

#

will the result be so complicated?

wide spear Apr 14, 2021, 3:14 PM

#

It will be reasonably complicated

#

But it should simplify to the identity matrix

normal lava Apr 14, 2021, 3:18 PM

#

I tried the multiplication but it doesn't look like an identity matrix

wide spear Apr 14, 2021, 3:19 PM

#

Well what do you have?

normal lava Apr 14, 2021, 3:20 PM

#

a long expression, maybe i dont know how to simplify it

#

would you mind showing me how to do it?

#

Are there any tactics for doing the multiplication?

brave crypt Apr 15, 2021, 2:03 AM

#

Maybe you can do it as two rank 1 updates, using the woodbury formula. But otherwise doing the multiplication carefully is the only "tactic" I can think of
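
The update formula from the screenshot is not in the log, so this is only a sketch of the general idea: a rank-one update has a closed-form inverse (Sherman-Morrison, the rank-one case of Woodbury), and it can be verified by multiplying out, exactly as suggested above. The matrix `B` and vector `u` are random placeholders, and `sherman_morrison_inverse` is a hypothetical helper name.

```python
import numpy as np

def sherman_morrison_inverse(B_inv, u, v):
    """Inverse of B + u v^T, given B^{-1}; requires 1 + v^T B^{-1} u != 0."""
    Bu = B_inv @ u
    vB = v @ B_inv
    return B_inv - np.outer(Bu, vB) / (1.0 + v @ Bu)

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)      # symmetric positive definite, so safely invertible
u = rng.standard_normal(n)

B_new = B + np.outer(u, u)       # symmetric rank-one update (v = u keeps the denominator > 1)
B_new_inv = sherman_morrison_inverse(np.linalg.inv(B), u, u)

product = B_new @ B_new_inv
print(np.allclose(product, np.eye(n)))
```

An update made of two rank-one terms can be handled by applying the same helper twice, once per term, provided each denominator is nonzero.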

fleet sail Apr 15, 2021, 3:50 AM

#

one thing im consered with conj grad for least square is its slow in the end

#

concerned*

#

cuz u always have to compute A^T A

wide spear Apr 15, 2021, 3:50 AM

#

Well yeah

#

So you use something like GMRES for least squares

#

And not CG

#

You never want to solve the normal equations

fleet sail Apr 15, 2021, 3:51 AM

#

ic our prof recommended conj grad, it seems like the course is not going in depth enough

#

but assuming we already solved A^T A then its good ig

#

cuz that multiplication takes order n^2 * m i think which is just bad

wide spear Apr 15, 2021, 3:54 AM

#

Another thing is that it doubles the condition number so iterative methods take twice as long to converge (roughly)

fleet sail Apr 15, 2021, 3:56 AM

#

wide spear Another thing is that it doubles the condition number so iterative methods take ...

which thing doubles the condition num?

wide spear Apr 15, 2021, 3:57 AM

#

Turning A into A^TA

fleet sail Apr 15, 2021, 3:57 AM

#

oh ok thx

prime kraken Apr 15, 2021, 4:27 AM

#

what definition of condition number are you using?

wide spear Apr 15, 2021, 4:28 AM

#

Like

#

The condition number of a matrix

prime kraken Apr 15, 2021, 4:29 AM

#

doesn't A^T A square the singular values?

#

not multiply by 2

wide spear Apr 15, 2021, 4:29 AM

#

Yeah it squares them

#

But iterative methods have runtime proportional to log condition number

#

Or something

prime kraken Apr 15, 2021, 4:30 AM

#

ah yes

wide spear Apr 15, 2021, 4:30 AM

#

So twice as long
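
A quick numerical check of the squaring, as a sketch with a made-up random matrix: the 2-norm condition number of A^T A equals the square of that of A, so its logarithm doubles.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 10))        # tall least-squares matrix (hypothetical sizes)

kappa_A = np.linalg.cond(A)              # sigma_max / sigma_min
kappa_normal = np.linalg.cond(A.T @ A)   # condition number of the normal equations

print(kappa_A, kappa_normal, kappa_A ** 2)
print(np.log(kappa_normal) / np.log(kappa_A))   # ratio of the logs
```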

prime kraken Apr 15, 2021, 4:30 AM

#

yes yes

wide spear Apr 15, 2021, 4:30 AM

#

Yes

prime kraken Apr 15, 2021, 4:30 AM

#

yes

wide spear Apr 15, 2021, 4:30 AM

#

Yes

random hornet Apr 15, 2021, 6:27 AM

#

Yes

fervent ermine Apr 15, 2021, 10:23 AM

#

Yes

brave crypt Apr 15, 2021, 10:24 AM

#

YES!!!

azure shuttle Apr 15, 2021, 11:49 AM

#

yes

dark sinew Apr 15, 2021, 1:13 PM

#

Yes?

prime kraken Apr 15, 2021, 1:55 PM

#

https://tenor.com/view/yes-jotaro-kujo-jojos-gif-7297252

wide spear Apr 15, 2021, 2:04 PM

#

stare

fleet sail Apr 15, 2021, 2:53 PM

#

monkagigagun

uneven agate Apr 15, 2021, 5:25 PM

#

Good afternoon. I've been having a rough time with this question. Let $a,b,c$ be complex numbers, $a\neq b,c\neq 0$, and let
\begin{align*}
w=f(z):=\frac{1}{c}(e^{az}-e^{bz}).
\end{align*}
Show that the inverse function near $w=0$ is represented by
\begin{align*}
z=f^{[-1]}(w)=\frac{1}{d}\sum_{n=1}^\infty \frac{(-1)^{n-1}c^n}{n!}\left(\frac{nb}{d}+1\right)_{n-1}w^n,
\end{align*}
where $d:=a-b$. Derive as special cases the representations for the inverses of the functions
\begin{align*}
w=\sin z,\quad w=e^z-1,\quad w=e^{\alpha z}\sin \beta z,
\end{align*}
and, as the limiting case $a=b+c$, $c\to 0$, $w=ze^{bz}$. [To compute residues use formal integration by parts; see Problem $1.8.6$.]

pine jettyBOT Apr 15, 2021, 5:25 PM

#

TheRedLotus

wide spear Apr 15, 2021, 5:26 PM

#

#advanced-analysis

uneven agate Apr 15, 2021, 5:26 PM

#

My bad. Didn't see that channel.

wooden tendon Apr 16, 2021, 1:13 PM

#

Hello,
I would like to plot a damped sinusoidal frequency spectrum in matlab
here is the equation of this damped sinusoidal
\begin{align}
e^{-2t} \cdot \sin(4\pi t)
\end{align}
the fourier transform of such signal would be that if i'm not wrong
\begin{align}
\frac{4\pi}{(2+i2\pi f)^2 + (4\pi)^2}
\end{align}
but i don't know what to do with that

pine jettyBOT Apr 16, 2021, 1:13 PM

#

DawnUltra

wooden tendon Apr 16, 2021, 1:28 PM

#

here is my matlab code

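function [] = damped_sin()
x = 0:1:5;                       % frequency axis; a smaller step (e.g. 0.01) gives a smoother curve
a = zeros(1,numel(x));
% elementwise .^ and ./ are needed here; square() is a square-wave generator, not x.^2
a = (4*pi) ./ ((2+1i*2*pi*x).^2 + (2*pi*2)^(2));
plot(x,abs(a));                  % a is complex, so plot its magnitude
title("damped sinusoidal spectrum")
xlabel("frequency")
ylabel("amplitude")
end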

wide spear Apr 16, 2021, 2:44 PM

#

#computing-software

#

Ask there

wooden tendon Apr 16, 2021, 3:19 PM

#

ok

rotund terrace Apr 17, 2021, 4:37 PM

#

can somebody eli5 me Forrest Tomlin update from simplex method?

brave crypt Apr 19, 2021, 1:37 PM

#

rotund terrace can somebody eli5 me Forrest Tomlin update from simplex method?

Can give you simplex method matlab code
Not sure about Forrest-Tomlin

brave crypt Apr 19, 2021, 1:38 PM

#

wooden tendon here is my matlab code ```Matlab function [] = damped_sin() x = 0:1:5; a = zeros...

Use a 2D plot in the complex plane: write your function in a + bi form,
then x = a, y = b

wooden tendon Apr 19, 2021, 1:40 PM

#

ok

#

dvanapasa do you think my fourier transform is correct by the way ?

brave crypt Apr 19, 2021, 1:41 PM

#

wooden tendon dvanapasa do you think my fourier transform is correct by the way ?

Can't)
I did it 7 years ago
can't remember the formulas
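
One way to sanity-check the transform without remembering the formulas is to integrate numerically. This is a sketch that assumes the signal is zero for t < 0 and uses the convention X(f) = integral of x(t) e^{-i 2 pi f t} dt; the function names, the cutoff `t_max` and the step `dt` are arbitrary choices.

```python
import numpy as np

def spectrum_formula(f):
    """Closed form from the chat: 4*pi / ((2 + i*2*pi*f)^2 + (4*pi)^2)."""
    return 4 * np.pi / ((2 + 1j * 2 * np.pi * f) ** 2 + (4 * np.pi) ** 2)

def spectrum_numeric(f, t_max=20.0, dt=1e-4):
    """Trapezoid-rule approximation of the transform integral over [0, t_max]."""
    t = np.arange(0.0, t_max + dt, dt)
    y = np.exp(-2 * t) * np.sin(4 * np.pi * t) * np.exp(-1j * 2 * np.pi * f * t)
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

for f in (0.0, 1.0, 2.0, 3.5):
    print(f, spectrum_formula(f), spectrum_numeric(f))
```

Under those assumptions the two columns agree closely, which supports the closed form given earlier.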

wooden tendon Apr 19, 2021, 2:05 PM

#

no problem

turbid jay Apr 20, 2021, 8:18 AM

#

hey, how do I prove that this function (log sum exp) has a minimum? I know that it is smooth and convex (not strongly convex). I found it here: https://link.springer.com/content/pdf/10.1007/s10208-013-9150-3.pdf p. 12 but they just take the existence of a minimum for granted

brave crypt Apr 20, 2021, 8:21 AM

#

I think there is a common technique of proving that a function is convex and... "Forcing", maybe it was called? The property is that f(x)->infty as |x|->infty

#

Ah sorry, maybe this doesn't work? Yeah, I don't think this works, hm

turbid jay Apr 20, 2021, 8:22 AM

#

I've already proven convexity, but this tells me nothing about the existence of a minimizer

brave crypt Apr 20, 2021, 8:23 AM

#

What conditions do you have on the ai? (Nothing?)

turbid jay Apr 20, 2021, 8:24 AM

#

none, they are the rows of a matrix

#

let me double check this

sage vapor Apr 20, 2021, 8:25 AM

#

what if m=1 ?

turbid jay Apr 20, 2021, 8:25 AM

#

no conditions on A and b

brave crypt Apr 20, 2021, 8:26 AM

#

sage vapor what if m=1 ?

Ah yeah, something is weird, huh

#

I think intuitively if the ai "point in many directions" this will have a minimum (in R this would just be pointing in the positive and negative direction)

sage vapor Apr 20, 2021, 8:28 AM

#

I think you need m > number of dimensions

brave crypt Apr 20, 2021, 8:31 AM

#

Ah, the property I was thinking of is "coercive"

turbid jay Apr 20, 2021, 8:31 AM

#

maybe it doesn't always have an optimum...

As it is smooth, we expect the region around the optimum to be well approximated by a quadratic (assuming the optimum exists)

brave crypt Apr 20, 2021, 8:31 AM

#

https://en.m.wikipedia.org/wiki/Coercive_function

Coercive function

In mathematics, a coercive function is a function that "grows rapidly" at the extremes of the space on which it is defined. Depending on the context
different exact definitions of this idea are in use.

turbid jay Apr 20, 2021, 8:31 AM

#

ah yes, I read about coercive functions but don't know too much about it

#

maybe it's a good time now to look more into this

#

x \in R^n

#

rho is just a scalar parameter

#

A \in R^{mxd} and b \in R^m

brave crypt Apr 20, 2021, 8:33 AM

#

So yeah, I think ai pointing in many directions should imply this function is coercive => minimum exists. But maybe there are weaker conditions under which a minimum exists

sage vapor Apr 20, 2021, 8:33 AM

#

no that should be an equivalence here

#

if you find a direction where all the scalar products are negative well you can get as close to -infinity as you want
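
The exact function from the screenshot is not in the log, so the following sketch assumes the simplest form, f(x) = log(sum_i exp(a_i . x - b_i)), and ignores the scalar parameter. It contrasts the m = 1 case raised above with a case where the rows point in opposite directions.

```python
import numpy as np

def log_sum_exp(A, b, x):
    """f(x) = log(sum_i exp(a_i . x - b_i)), computed stably."""
    z = A @ x - b
    z_max = z.max()
    return z_max + np.log(np.exp(z - z_max).sum())

# m = 1 in one dimension: f(x) = a*x - b is linear, so it has no minimizer.
A1 = np.array([[1.0]])
b1 = np.array([0.0])
vals_one = [log_sum_exp(A1, b1, np.array([-t])) for t in (1.0, 10.0, 100.0)]

# Two rows pointing in opposite directions: f(x) = log(e^x + e^-x), minimized at x = 0.
A2 = np.array([[1.0], [-1.0]])
b2 = np.array([0.0, 0.0])
vals_two = [log_sum_exp(A2, b2, np.array([t])) for t in (-10.0, 0.0, 10.0)]

print(vals_one)   # keeps decreasing along the direction -a
print(vals_two)   # smallest at the middle point
```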

brave crypt Apr 20, 2021, 8:41 AM

#

Yeah I think it sounds right, although I'm not sure eg how many ai are necessary to do this

prime kraken Apr 20, 2021, 8:56 AM

#

doesn't convexity on its own imply the existence of a minimizer?

#

just not unique

brave crypt Apr 20, 2021, 8:57 AM

#

But linear functions are convex, exponential is convex, etc

prime kraken Apr 20, 2021, 8:58 AM

#

and they have a minimum

brave crypt Apr 20, 2021, 8:58 AM

#

I guess eg in ML world probably all loss functions are coercive, so there is an implication convex implies minimum exists

brave crypt Apr 20, 2021, 8:58 AM

#

prime kraken and they have a minimum

But we are working over the whole domain R/Rn, not eg over a compact domain

prime kraken Apr 20, 2021, 9:00 AM

#

you write R/Rn as a quotient or just to mean R^n

brave crypt Apr 20, 2021, 9:00 AM

#

Just as in R or Rn (really n=1 case), no quotient going on

prime kraken Apr 20, 2021, 9:01 AM

#

then yeah, convexity implies there is a minimum

brave crypt Apr 20, 2021, 9:01 AM

#

I think there is just a miscommunication, linear function obviously goes down to -infinity

prime kraken Apr 20, 2021, 9:01 AM

#

because the function is strictly non-decreasing

#

it won't be unique in general, but they will all be gathered in a single region

brave crypt Apr 20, 2021, 9:02 AM

#

You are saying: f is a function from Rn to R, then f convex implies f has a minimum?

prime kraken Apr 20, 2021, 9:02 AM

#

yeah

#

or rather, than any local minimum will also be global

brave crypt Apr 20, 2021, 9:03 AM

#

Yes, I totally agree with local implies global

prime kraken Apr 20, 2021, 9:03 AM

#

aside from that, the minimum would be at a boundary if not within the domain

#

you would have to either find where the gradient is a 0 vector, because that is necessary and sufficient due to what we just said, or find the optimum along the boundary of the domain if the gradient does not become 0

turbid jay Apr 20, 2021, 9:05 AM

#

prime kraken doesn't convexity on its own imply the existence of a minimizer?

no, for that you need strong convexity

brave crypt Apr 20, 2021, 9:07 AM

#

I was gonna say "one of us is on drugs 🙂 " but I am already drinking

prime kraken Apr 20, 2021, 9:07 AM

#

checking again, strict convexity does not imply it either

#

😛

turbid jay Apr 20, 2021, 9:07 AM

#

there is strong and there is strict convex. Let me dig up the definitions

brave crypt Apr 20, 2021, 9:07 AM

#

Strong convexity implies it is lower bounded by a quadratic right?

prime kraken Apr 20, 2021, 9:08 AM

#

i might've mixed them up

turbid jay Apr 20, 2021, 9:08 AM

#

strong convexity bounds the function from below

#

#

but I don't know if this function is strong convex. Maybe that's a starting point 🙂

brave crypt Apr 20, 2021, 9:11 AM

#

But we already know it is not strong convex without some conditions on ai

prime kraken Apr 20, 2021, 9:12 AM

#

a quick google-fu says you can use strict convexity and the function being coercive to show the minimum exists

brave crypt Apr 20, 2021, 9:13 AM

#

Yeah, and coercive should mean that ai point in "many" directions, ie every vector in Rn has a positive dot product with some ai

turbid jay Apr 20, 2021, 9:15 AM

#

prime kraken a quick google-fu says you can use strict convexity and the function being coerc...

but I think that it's not strictly convex

#

I have to read a bit more into coercive functions, I think that is the correct way to show it

sonic shuttle Apr 20, 2021, 11:56 AM

#

turbid jay

where is this from?

turbid jay Apr 20, 2021, 12:22 PM

#

sonic shuttle where is this from?

a script from my lecture

cobalt lintel Apr 21, 2021, 10:37 AM

#

Hello! So I am trying to write a program that draws my teacher using epicycles as a way of saying thanks. I have noticed that a lot of programs use the discrete fourier transform for this, however, I am not that comfortable with this method so I am here to ask a few questions. So let's say that I somehow get a lot of coordinates of a simple picture that I want to use. Then I should use the DFT to get the complex points using the formula above. But how do I proceed from there? How do I calculate the radius of each circle? How fast should they spin?

cobalt lintel Apr 21, 2021, 10:57 AM

#

Okay so from this I guess that the frequency of each circle would be k/N and that the amplitude (radius) would be 1/N*sqrt(Re^2+Im^2), is this correct?

cobalt lintel Apr 21, 2021, 11:20 AM

#

So this is my guess. Let's say that I have the first circle, then I should add another circle that spins around the first circle's circumference. The middle point of that circle should be x = r cos(2pi k t/N), y = r sin(2pi k t/N) where t is the time. But how much should t increase by for each loop? Can you just pick an arbitrary number, say 0.5, that the time should increase by? I.e. t += 0.5 each loop?
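
The guess about radius and frequency can be tried out directly. This is a minimal sketch, assuming the outline is stored as complex numbers x + iy and using NumPy's FFT; `epicycles` and `pen_position` are hypothetical helper names, and the 8-point square is a placeholder outline. Here time is rescaled so that t runs over [0, 1) for one full traversal, which makes circle k spin k times per traversal.

```python
import numpy as np

def epicycles(points):
    """Return (freqs, radii, phases) of the circles for a closed path of complex points."""
    N = len(points)
    coeffs = np.fft.fft(points) / N              # c_k = X_k / N
    freqs = np.fft.fftfreq(N, d=1.0 / N)         # signed integer frequencies: 0, 1, ..., -1
    return freqs, np.abs(coeffs), np.angle(coeffs)

def pen_position(freqs, radii, phases, t):
    """Tip of the chain of circles at time t in [0, 1); t = n/N hits sample n."""
    return np.sum(radii * np.exp(1j * (2 * np.pi * freqs * t + phases)))

# Hypothetical outline: 8 points on a square, stored as x + i*y.
pts = np.array([0+0j, 1+0j, 2+0j, 2+1j, 2+2j, 1+2j, 0+2j, 0+1j])
freqs, radii, phases = epicycles(pts)
rebuilt = np.array([pen_position(freqs, radii, phases, n / len(pts)) for n in range(len(pts))])
print(np.allclose(rebuilt, pts))
```

The time step is a free choice in this sketch: any step reproduces the sample points at t = n/N, and smaller steps only make the drawn curve between them smoother.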

wide spear Apr 21, 2021, 2:09 PM

#

The DFT is not appropriate for this

#

Instead you want to be taking a Fourier series

cobalt lintel Apr 21, 2021, 2:17 PM

#

wide spear The DFT is not appropriate for this

But doesn't the fourier series only work for periodic functions?

#

why?

wide spear Apr 21, 2021, 2:18 PM

#

So the way they do the fancy animations is they make the image into a periodic function in polar coordinates

cobalt lintel Apr 21, 2021, 2:18 PM

#

oh oops...

#

But a lot of programs that I've seen use the DFT on both the x-coordinates and the y-coordinates

#

So what is the difference between DFT and fourier series?

#

Okay, thanks for the info! But can't I just get a simple outline of my teacher's face and somehow get the x and y coordinates of all those points and perform the DFT on both the x and y coordinates separately? Wouldn't the complex numbers generated by the DFT provide all the necessary info for the circles etc.?

#

Yeah true... But I am not that confident with polar coordinates

#

Oh yeah, that's true. I will definitely give this method a try! Thank you so much for the information!

#

What is interpolate? Taking the mean value?

fleet sail Apr 22, 2021, 2:57 AM

#

i guess theres a typo here?

Screen_Shot_2021-04-22_at_10.57.28_AM.png

#

heres original matrix, i thought to clear out the 0

Screen_Shot_2021-04-22_at_10.48.57_AM.png

#

you need to use b1 rather than b2

#

and would the subsequent givens rotation just be Gk = G(k,k+1,theta)

wide spear Apr 22, 2021, 2:59 AM

#

They are doing a Givens rotation for the upper left 2x2 block

#

That's why the b_1 in position A_12 becomes q_1

#

I think

#

Hmmmm

fleet sail Apr 22, 2021, 3:00 AM

#

it says choose theta satisfying that thing

#

but i thought it should be b1

#

there

#

instead of b2

wide spear Apr 22, 2021, 3:00 AM

#

That's true

#

Hmmmmm

#

Well, do you understand the point of Givens rotations

fleet sail Apr 22, 2021, 3:04 AM

#

yea

#

i think i got it though theres 2 typos

#

like I know its to triangularize and here its O(n) time apparently

#

the one thing i dont quite sure

#

is when you get (Q^T)A = R after performing the sequence of givens right

#

then A = QR, but why is it that when you do RQ = B, then B is again tridiagonal, I guess just prove it inductively by looking at each givens rot and look at the things elementwise?

fleet sail Apr 22, 2021, 4:29 AM

#

ok i figured out the implementation

#

i just need help to prove QRQ^T is also tridiagonal

fleet sail Apr 22, 2021, 9:14 AM

#

Ok i also figured out theoretically why

#

its cuz B = RQ = Q^TAQ is symmetric and u can show that when u apply the sequence of givens rotations to R from the right, everything below the first subdiagonal stays 0 because A is tridiagonal (so B is upper Hessenberg and symmetric, hence tridiagonal)
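
The claim is easy to check numerically. This sketch uses a random symmetric tridiagonal matrix (sizes and seed are arbitrary) and NumPy's Householder-based `np.linalg.qr` rather than explicit Givens rotations; for a nonsingular A the Q factor is the same up to column signs, so the band structure of RQ is unaffected.

```python
import numpy as np

n = 6
rng = np.random.default_rng(3)
d = rng.standard_normal(n)          # diagonal
e = rng.standard_normal(n - 1)      # off-diagonal
A = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)   # symmetric tridiagonal

Q, R = np.linalg.qr(A)
B = R @ Q                            # one unshifted QR step, B = Q^T A Q

outside_band = np.triu(B, 2) + np.tril(B, -2)     # everything off the three central diagonals
print(np.allclose(B, B.T), np.allclose(outside_band, 0.0))
```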

#

damn QR diagonalization is epic

naive creek Apr 22, 2021, 12:48 PM

#

Hello, I have a question about preconditioning in numerical linear algebra. On Wikipedia and other sites I found that solving a system of linear equations Ax=b where the condition number of A is low can lead to a fast rate of convergence for an iterative method of solving Ax=b. However, I can't seem to find or understand why a low condition number leads to a fast convergence. Does anybody have good source which explains this or does anyone mind to explain themselves? Thanks in advance!

naive creek Apr 22, 2021, 1:05 PM

#

Thanks for the link! Unfortunately I'm not really able to find the answer with the information. The explanation also seems quite advanced and the question connected with the course is an introductory numerical maths course. So I think there could be an 'easier' explanation for this.

naive creek Apr 22, 2021, 1:21 PM

#

Mmm yeah sorry advanced isn't quite right to say. tbh I don't really understand the answer and I think that this is not the answer they would expect in my situation since the explanation is not very related to the content of the course. I was more thinking about finding a formula or so that includes the condition number of the matrix and a measurement for the speed or convergence.

#

Not really, we should be able to find the answer for this question with the content that we saw. (But I'm not really able to find it with the course material I have.)

prime kraken Apr 22, 2021, 1:42 PM

#

you can interpret it as being related to the lipschitz constant of the gradient of the function $\frac{1}{2}\Vert Ax - B \Vert_2^2$

pine jettyBOT Apr 22, 2021, 1:42 PM

#

Edd

prime kraken Apr 22, 2021, 1:44 PM

#

if all of the singular values of A are the same, the gradient is equally smooth in all directions. if not, then the condition number increases and some dimensions change more slowly than others, but you have to account for the worst case scenario to guarantee convergence

#

this usually results in iterative updates that change some dimensions very slowly
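
A small experiment shows the effect. It is a sketch under simple assumptions: A is diagonal so its singular values can be set directly, b is all ones, and the method is plain gradient descent with step 1/L, where L is the Lipschitz constant of the gradient. The function name `gd_iterations` is made up for this example.

```python
import numpy as np

def gd_iterations(singular_values, tol=1e-8, max_iter=100_000):
    """Gradient descent on 0.5*||Ax - b||^2 with A = diag(singular_values), step 1/L."""
    s = np.asarray(singular_values, dtype=float)
    A = np.diag(s)
    b = np.ones(len(s))
    L = s.max() ** 2                 # Lipschitz constant of the gradient A^T (Ax - b)
    x = np.zeros(len(s))
    for k in range(max_iter):
        grad = A.T @ (A @ x - b)
        if np.linalg.norm(grad) < tol:
            return k
        x -= grad / L
    return max_iter

print(gd_iterations([1.0, 1.0]))    # condition number 1
print(gd_iterations([1.0, 0.1]))    # condition number 10
```

With equal singular values the safe step is ideal in every direction; with unequal ones the step is set by the largest singular value, so the direction with the small singular value creeps toward its target.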

naive creek Apr 22, 2021, 1:46 PM

#

Thank you! I think I'm getting it. The formula is also very helpful 🙂

mental grail Apr 22, 2021, 2:28 PM

#

I'm trying to decide which courses to take next semester, and I can't decide whether two in particular would be helpful. The first course gives an introduction to measure theory with a slight focus on its connection to probability theory, and the second course introduces functional analysis. I'm interested in machine learning and its foundational fields, so I'm curious whether these courses would be useful there.

#

(I hope this is the right channel, otherwise please point me to the correct one)

#

I am specializing in robotics/mechatronics/control theory, so if either course is useful there, that'd be great too

wide spear Apr 22, 2021, 2:34 PM

#

Go ask in #math-discussion

mental grail Apr 22, 2021, 2:37 PM

#

Thanks!

tall solar Apr 23, 2021, 1:57 AM

#

Hey svd question
say you have X=USV^T

I'm reading somewhere that the "orthogonal projection onto the column space of X" is UU^T

What does that even mean?? UU^T should be an identity ??

wide spear Apr 23, 2021, 2:05 AM

#

Yes, UU^T is the identity

orchid sequoia Apr 23, 2021, 2:07 AM

#

is it reduced svd?

tall solar Apr 23, 2021, 2:12 AM

#

I don't think it being reduced has to do with anything.

It doesn't seem like projecting onto the column space changes anything wth

brave crypt Apr 23, 2021, 2:15 AM

#

I recall there are multiple SVDs, and some of them are such that UU^T are identity and some are not, or something like that

orchid sequoia Apr 23, 2021, 2:15 AM

#

yes, indeed

#

ig in this case UU^T could be not identity, as X can be singular

fleet sail Apr 23, 2021, 2:17 AM

#

i think its always identity?

#

cuz AAT is diagonalizable

brave crypt Apr 23, 2021, 2:17 AM

#

I guess it sounds sort of right, if X is surjective then probably this means good shit happens, and UU^T is the identity, maybe, and so is orthogonal projection

orchid sequoia Apr 23, 2021, 2:17 AM

#

U^T U is always identity

orchid sequoia Apr 23, 2021, 2:18 AM

#

fleet sail cuz AAT is diagonalizable

depends on the definition of the SVD method

tall solar Apr 23, 2021, 2:18 AM

#

See I wanna take a matrix X1 and another X2

Then find the closest matrix to X2 that has the same row and column space as X1

fleet sail Apr 23, 2021, 2:18 AM

#

right i think U and V i was thinking are square

brave crypt Apr 23, 2021, 2:19 AM

#

I remember being sad because I was working on an exercise that talks about SVD, but they must have meant a different SVD than the normal one because the shit they were saying did not make sense

orchid sequoia Apr 23, 2021, 2:19 AM

#

lol

tall solar Apr 23, 2021, 2:22 AM

#

I saw another formula for the "projection onto the column space of X"
That was like $ X*(X^{T} *X)^{-1} X^{T}$
But if you use the svd X=USV^T
It all reduces to nothing no matter what

Weird stuff

tall solar Apr 23, 2021, 4:31 AM

#

Nvm it works out I'm insane. You guys were right it works

prime kraken Apr 23, 2021, 6:31 AM

#

this was already a while ago, but for completeness, U U^T is the identity if U is square, since then U contains the basis of the column space and the left null space. normally though, you ignore the singular vectors corresponding to the 0 singular values since you want to, as people said above, project onto the column space. in this case, U U^T is not an identity. this latter one is called "economy-size SVD"

#

so it's usually a good idea to say U = [U_c U_l], showing explicitly which part is a basis for the column space and which part for the left null space

#

then U U^H is always an identity, and U_c U_c^H (economy or reduced svd) is an identity if the matrix is full rank
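
The distinction can be seen directly with `np.linalg.svd`, whose `full_matrices` flag switches between the two variants. This is a sketch on a made-up 6 by 3 real matrix with full column rank, so U^H is just U^T here.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((6, 3))                 # tall, full column rank (hypothetical sizes)

U_full, s, Vt = np.linalg.svd(X, full_matrices=True)     # U_full is 6x6
U_c, _, _ = np.linalg.svd(X, full_matrices=False)        # U_c is 6x3

P_svd = U_c @ U_c.T                              # projector onto the column space
P_normal = X @ np.linalg.inv(X.T @ X) @ X.T      # X (X^T X)^{-1} X^T

print(np.allclose(U_full @ U_full.T, np.eye(6)))   # full U: identity
print(np.allclose(P_svd, np.eye(6)))               # reduced U: not the identity
print(np.allclose(P_svd, P_normal))                # both formulas give the same projector
print(np.allclose(P_svd @ X, X))                   # projecting X onto its own column space changes nothing
```

The last line is the reason a full-rank square image comes back unchanged: its column space is the whole space, so the projector is the identity.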

brave crypt Apr 23, 2021, 4:22 PM

#

idk if this is the channel but

#

Imagine i have an image, and i have a mask for it

#

what is the operation i need to do to leave the resulting background as white?

#

Like this

#

https://gyazo.com/cc5b4dc839c82435935fd92898ce2ef9

#

But with the background white

#

the masked image is the third bird

wide spear Apr 23, 2021, 4:25 PM

#

#computing-software

brave crypt Apr 23, 2021, 4:25 PM

#

ty

tall solar Apr 23, 2021, 7:31 PM

#

prime kraken then U U^H is always an identity, and U_c U_c^H (economy or reduced svd) is an i...

Yes! I'm a bit embarrassed I didn't realize this at first lol.

I was curious about projecting svd of one natural image onto another. Then I realized what you're saying now. In a natural image all the sv's will typically be nonzero (or error or whatever reason). I was upset I kept getting the same image but then I realized since they were full rank they have exactly the same column and row space.

brave crypt Apr 23, 2021, 7:31 PM

#

for i in range(N):
  for j in range(M):
    if not mask[i*M+j]:
      image[i*M+j] = 0x88DDDA

:tinkTonk:
Holy shit that was painful to type on phone lmao

wide spear Apr 23, 2021, 7:32 PM

#

think2

tall solar Apr 23, 2021, 7:33 PM

#

Btw
I ended up truncating the singular values of each natural image then performing the projection. But when I did the projection between a dog and a cat photo nothing particularly cool happened.

I wanted to see a cat dog :(

wide spear Apr 23, 2021, 7:33 PM

#

Conclusion cat=dog

prime kraken Apr 23, 2021, 7:33 PM

#

the basis was probably similar

tall solar Apr 23, 2021, 7:34 PM

#

Yeah if you keep like 5 singular values you might get lucky with certain photos and keep some color of the other

prime kraken Apr 23, 2021, 7:34 PM

#

you'd have to do a very low rank approx to see anything neat, maybe

#

yea

#

if you had several examples, you can do something more similar to that

#

like a basis for several dog images

#

this is more like the so-called eigenfaces alg, which is pca, which is svd in a trench coat

tall solar Apr 23, 2021, 7:36 PM

#

Yes I was thinking the same thing! I have found a dataset utkFace of a bunch of centered face images.

I wonder if I can turn a man into a lady

prime kraken Apr 23, 2021, 7:38 PM

#

that sort of classifier should work something like doing a projection and then measuring the error

#

that projection should be the component of the original image most similar to those in the basis you found

#

very roughly, anyway. in ML, the nonlinear activation functions let the result be more sophisticated

tall solar Apr 23, 2021, 8:09 PM

#

Nice yeah machine learning is cool lol.

Before ml I wanted to try some tensor factorization on color images. I read about this cool factorization called the t svd that I'm really vibing with.

prime kraken Apr 23, 2021, 8:10 PM

#

i only know HOSVD and PARAFAC

#

never heard of T SVD before

#

looks interesting

wide spear Apr 23, 2021, 8:17 PM

#

think2

tall solar Apr 23, 2021, 8:19 PM

#

what's cool about it is that it essentially works by applying svd in the Fourier domain.

There's a block matrix associated with a 3d tensor called the block circulant that has the same spectrum as the Fourier modes

brave crypt Apr 23, 2021, 8:19 PM

#

HO-SVD tinkTonk

wide spear Apr 23, 2021, 8:20 PM

#

Ah yes

#

Circulant matrices

prime kraken Apr 23, 2021, 8:20 PM

#

i'd have to see how the circulant is constructed, but yeah

#

all circulant mats are diagonalized by fourier matrices

wide spear Apr 23, 2021, 11:00 PM

#

Something something no screenshot

tall solar Apr 23, 2021, 11:02 PM

#

Lol got you

wide spear Apr 24, 2021, 1:06 AM

#

@brave crypt was there a specific section you had questions about

brave crypt Apr 24, 2021, 1:07 AM

#

Hello

#

Please wait.

#

These two equations:

#

wide spear Apr 24, 2021, 1:09 AM

#

Ok so let's look at the first one first

brave crypt Apr 24, 2021, 1:09 AM

#

Sure 🙂

#

This is the paper, if you don't want to switch the channels: https://arxiv.org/pdf/1905.12120.pdf 😀

wide spear Apr 24, 2021, 1:11 AM

#

Ok

#

So I'm reading this

#

And I'm not entirely sure if X and W are matrices or vectors

#

Which is clearly an oversight on the part of the authors

#

Someone more experienced with ML may be able to deduce this

brave crypt Apr 24, 2021, 1:12 AM

#

I have a book, let me see if I can find annotation for it.

wide spear Apr 24, 2021, 1:13 AM

#

Honestly

#

They don't even share a github repo

brave crypt Apr 24, 2021, 1:13 AM

#

I am sorry.

wide spear Apr 24, 2021, 1:13 AM

#

I mean

#

It's not your fault

brave crypt Apr 24, 2021, 1:14 AM

#

Yes, but still.

wide spear Apr 24, 2021, 1:14 AM

#

Anyways

brave crypt Apr 24, 2021, 1:14 AM

#

W is Weight Matrix.

wide spear Apr 24, 2021, 1:14 AM

#

$\sum_{j=1}X[i+jr]W[j]$ is a convolution

pine jettyBOT Apr 24, 2021, 1:14 AM

#

Angetenar

brave crypt Apr 24, 2021, 1:14 AM

#

x: Input Vector

wide spear Apr 24, 2021, 1:14 AM

#

Lower case x or capital X

#

I think that X here is also a matrix

brave crypt Apr 24, 2021, 1:15 AM

#

Then you must be right.

wide spear Apr 24, 2021, 1:18 AM

#

This paper is so bad

#

They don't define anything

#

Anyways

#

Do you understand how convolutions work

brave crypt Apr 24, 2021, 1:19 AM

#

If you can explain me briefly then it would be really helpful.

wide spear Apr 24, 2021, 1:19 AM

#

Ok

brave crypt Apr 24, 2021, 1:19 AM

#

Thanks 🙂

wide spear Apr 24, 2021, 1:19 AM

#

Let's consider the 3 by 3 matrix $\begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}$ and the 2 by 2 kernel $\begin{bmatrix}-1&1\\-2&2\end{bmatrix}$

pine jettyBOT Apr 24, 2021, 1:19 AM

#

Angetenar

wide spear Apr 24, 2021, 1:21 AM

#

Then, the convolution of this would be $\begin{bmatrix}-1\times1+1\times2-2\times4+2\times5&-1\times2+1\times3-2\times5+2\times6\\-1\times4+1\times5-2\times7+2\times8&-1\times5+1\times6-2\times8+2\times9\end{bmatrix}$

pine jettyBOT Apr 24, 2021, 1:21 AM

#

Angetenar

wide spear Apr 24, 2021, 1:21 AM

#

Do you see what has happened

#

We take the kernel (also called a filter) and we slide it through the matrix

#

We start at the top left

#

And we take all the elementwise products

#

And then add them all together

#

And then we slide the filter over by 1

#

And then we reach the end of the row so we move it down 1 and start over from the left

#

And then we slide over by 1 again

#

And then we're done

#

So for a filter

#

We also have two strides, sigma_x and sigma_y, which determine how much you slide in the x and y directions

#

You also have a dilation term

#

Which determines how big the kernel is when mapped onto the matrix

#

In this example I worked out, the dilation would be 1

#

However, if the dilation were 2, for example, we would have the result of the convolution as $\begin{bmatrix}-1\times1+1\times3-2\times7+2\times9\end{bmatrix}$

pine jettyBOT Apr 24, 2021, 1:24 AM

#

Angetenar

wide spear Apr 24, 2021, 1:25 AM

#

Does this make sense

brave crypt Apr 24, 2021, 1:25 AM

#

Yes, we will skip it.

#

2 positions 🙂

wide spear Apr 24, 2021, 1:26 AM

#

In essence, the kernel $\begin{bmatrix}-1&1\\-2&2\end{bmatrix}$ with dilation 2 is equivalent to $\begin{bmatrix}-1&0&1\\0&0&0\\-2&0&2\end{bmatrix}$
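
The sliding procedure, with stride and dilation, can be written as a short function. This is a sketch in NumPy with no padding and no kernel flip (so strictly a cross-correlation, which is what ML libraries usually call convolution); the function name and argument layout are made up for this example.

```python
import numpy as np

def conv2d(x, k, stride=(1, 1), dilation=1):
    """Sliding-window sum of elementwise products (no kernel flip, no padding)."""
    kh, kw = k.shape
    span_h = (kh - 1) * dilation + 1      # footprint of the dilated kernel
    span_w = (kw - 1) * dilation + 1
    out_h = (x.shape[0] - span_h) // stride[0] + 1
    out_w = (x.shape[1] - span_w) // stride[1] + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride[0], j * stride[1]
            patch = x[r:r + span_h:dilation, c:c + span_w:dilation]
            out[i, j] = np.sum(patch * k)
    return out

x = np.arange(1, 10).reshape(3, 3)
k = np.array([[-1, 1], [-2, 2]])
k_dilated = np.array([[-1, 0, 1], [0, 0, 0], [-2, 0, 2]])

print(conv2d(x, k))               # 2x2 output for dilation 1
print(conv2d(x, k, dilation=2))   # single entry for dilation 2
print(conv2d(x, k_dilated))       # same single entry, using the zero-filled 3x3 kernel
```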

#

Anyways

#

Ok

pine jettyBOT Apr 24, 2021, 1:26 AM

#

Angetenar

wide spear Apr 24, 2021, 1:26 AM

#

Ok

#

So now that we know what a convolution is

brave crypt Apr 24, 2021, 1:26 AM

#

Yes 👍

#

Thank you for explaining that.

#

😀

wide spear Apr 24, 2021, 1:27 AM

#

$\sum_{j=1}X[i+jr]W[j]$ computes a convolution for a single output $Y(i)$

pine jettyBOT Apr 24, 2021, 1:27 AM

#

Angetenar

wide spear Apr 24, 2021, 1:27 AM

#

Y is a matrix I think

#

Their notation sucks

#

Anyways

#

Once we compute this

#

We apply ReLU

#

Which is $\mathrm{ReLU}\left(\sum_jX[i+jr]W[j]\right)=\max\left\{0,\sum_jX[i+jr]W[j]\right\}$

pine jettyBOT Apr 24, 2021, 1:28 AM

#

Angetenar

wide spear Apr 24, 2021, 1:28 AM

#

Ok

#

You know what relu is right

#

If something is positive, it stays the same

brave crypt Apr 24, 2021, 1:29 AM

#

Yes, max(0, N)

wide spear Apr 24, 2021, 1:29 AM

#

And if something is negative, it becomes 0

#

Yep

#

Ok

#

After we apply ReLU, we batch normalize

#

Do you know what batch normalize means

brave crypt Apr 24, 2021, 1:29 AM

#

Where I can take material to study about applied-computational-math?

#

Nope. I am sorry.

wide spear Apr 24, 2021, 1:30 AM

#

Ok

#

Batch normalize means that you shift and scale the data so that it has mean 0 and standard deviation 1

#

What sort of applied math are you interested in?

brave crypt Apr 24, 2021, 1:30 AM

#

Cool

wide spear Apr 24, 2021, 1:31 AM

#

And to do this, we need the two parameters beta and gamma
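
Putting the three steps together, here is a one-dimensional sketch of the pipeline as described in this conversation: dilated convolution, then ReLU, then batch normalization with a learned scale gamma and shift beta. The paper's actual shapes are unclear, so the vectors, the 0-based index j, and normalizing over a single vector rather than a batch are all simplifying assumptions.

```python
import numpy as np

def dilated_conv1d(x, w, r):
    """y[i] = sum_j x[i + j*r] * w[j], for every i where the window fits (0-based j)."""
    K = len(w)
    n_out = len(x) - (K - 1) * r
    return np.array([sum(x[i + j * r] * w[j] for j in range(K)) for i in range(n_out)])

def relu(z):
    """Keep positive entries, replace negative ones with 0."""
    return np.maximum(0.0, z)

def batch_norm(z, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize to zero mean and unit variance, then scale by gamma and shift by beta."""
    z_hat = (z - z.mean()) / np.sqrt(z.var() + eps)
    return gamma * z_hat + beta

x = np.array([1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0])
w = np.array([1.0, 0.5, -1.0])
conv = dilated_conv1d(x, w, r=2)
y = batch_norm(relu(conv), gamma=2.0, beta=0.5)
print(conv, y, y.mean(), y.std())
```

After the last step the output has mean beta and standard deviation close to gamma, which is what the two parameters are for.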

#

Does the first formula make sense now

brave crypt Apr 24, 2021, 1:31 AM

#

Got it.

#

Thank you very much 👍

wide spear Apr 24, 2021, 1:32 AM

#

Ok now for the second formula

brave crypt Apr 24, 2021, 1:33 AM

#

Yay.

#

Thank you for this 🙂

wide spear Apr 24, 2021, 1:34 AM

#

Do you know what a loss function is

brave crypt Apr 24, 2021, 1:37 AM

#

I apologize for leaving.

#

I think it's related to gradient descent to reach a local minimum.

wide spear Apr 24, 2021, 1:39 AM

#

Yes

#

A loss function measures how bad your model is doing

brave crypt Apr 24, 2021, 1:39 AM

#

I see.

wide spear Apr 24, 2021, 1:39 AM

#

N is the total number of pixels

#

Sure, straightforward

brave crypt Apr 24, 2021, 1:39 AM

#

And would you please tell me how it measures that?

wide spear Apr 24, 2021, 1:40 AM

#

P_{n,m} is what the model predicts for pixel n at scale m

#

Notice in the model architecture in figure 2, there are three down sampling layers so you have a total of 4 scales

brave crypt Apr 24, 2021, 1:41 AM

#

Yes, I am sorry I have two questions.

#

May I ask you?

wide spear Apr 24, 2021, 1:41 AM

#

Let me finish this explanation first

brave crypt Apr 24, 2021, 1:42 AM

#

Sure. 😀

wide spear Apr 24, 2021, 1:42 AM

#

G_n is the true label for pixel n

#

In the loss function, we compute $\sum_{m=1}^4\qty(1-\sum_{n=1}^N\frac{2G_nP_{n,m}}{G_n+P_{n,m}+\eps})$

pine jettyBOT Apr 24, 2021, 1:43 AM

#

Angetenar

wide spear Apr 24, 2021, 1:43 AM

#

The outer sum is straightforward, because we want the errors across all 4 scales right

#

Now for the thing inside

brave crypt Apr 24, 2021, 1:43 AM

#

Yes

wide spear Apr 24, 2021, 1:43 AM

#

We sum the error across all the pixels

#

Sensible

#

At each pixel

#

We have this Sorensen-Dice coefficient

#

Which is used to gauge how similar two things are

#

Tbh the paper doesn't really make sense here

#

In my opinion

#

Because you don't specify what values Gn and Pnm could take on

#

Oh hitbox

#

You are here

#

Don't you know ML

#

Have you heard of Sorensen-Dice loss before

#

Rip

#

Have fun coding

brave crypt Apr 24, 2021, 1:48 AM

#

This is how it starts:

#

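from tensorflow.keras import losses  # assumed import; the snippet only shows `losses.Loss`


class DiceLossVariants(losses.Loss):

    def __init__(self, *args, **kwargs):
        # Forward only the arguments the Keras base class understands.
        super_args = dict()
        if 'reduction' in kwargs:
            super_args['reduction'] = kwargs['reduction']
        if 'name' in kwargs:
            super_args['name'] = kwargs['name']
        super().__init__(**super_args)
        # The variant name comes first positionally or as `loss_name=...`.
        if len(args) > 0:
            self.loss_name = args[0]
        elif 'loss_name' in kwargs:
            self.loss_name = kwargs['loss_name']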

wide spear Apr 24, 2021, 1:48 AM

#

There's this paper

#

And it has some formulas

#

But it doesn't really define anything formally enough for the formulas to make sense

#

rEEEEEEEEEEE

#

https://arxiv.org/pdf/1905.12120.pdf

#

Here is the paper

brave crypt Apr 24, 2021, 1:49 AM

#

😀

wide spear Apr 24, 2021, 1:49 AM

#

We are looking at formula (2) right now

#

Unclear

#

Ranges are unspecified

#

I was thinking that because this is binary classification, they would be 0 or 1

#

Which is why there would be an epsilon in the denominator
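
To make the concern about formula (2) concrete, here is a hedged sketch comparing the formula as transcribed above with the common "soft Dice" form, where the sums over pixels are taken before dividing. It assumes `G` holds 0/1 labels and `P` holds values in [0, 1] with one column per scale; the function names and the value of `EPS` are illustrative.

```python
import numpy as np

EPS = 1e-7

def dice_loss_as_written(G, P, eps=EPS):
    """Formula as transcribed: sum_m (1 - sum_n 2 G_n P_nm / (G_n + P_nm + eps)).

    G has shape (N,), P has shape (N, M) with one column per scale.
    """
    G = np.asarray(G, dtype=float)[:, None]
    P = np.asarray(P, dtype=float)
    per_pixel = 2.0 * G * P / (G + P + eps)
    return float(np.sum(1.0 - per_pixel.sum(axis=0)))

def soft_dice_loss(G, P, eps=EPS):
    """Common soft Dice loss: each scale contributes a value in [0, 1]."""
    G = np.asarray(G, dtype=float)[:, None]
    P = np.asarray(P, dtype=float)
    overlap = 2.0 * (G * P).sum(axis=0)
    total = G.sum(axis=0) + P.sum(axis=0) + eps
    return float(np.sum(1.0 - overlap / total))

G = np.array([1.0, 1.0, 0.0, 0.0])
P_perfect = np.tile(G[:, None], (1, 4))   # a perfect prediction at all 4 scales
print(dice_loss_as_written(G, P_perfect), soft_dice_loss(G, P_perfect))
```

For a perfect prediction the soft Dice loss is essentially 0, while the formula as written goes negative (roughly -4 here) because the per-pixel terms are summed without normalization. That is consistent with the remark that the paper leaves the ranges underspecified.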

brave crypt Apr 24, 2021, 1:51 AM

#

Lol

#

Yes, there is dilation.

wide spear Apr 24, 2021, 1:54 AM

#

The prediction for pixel n at scale m

brave crypt Apr 24, 2021, 1:57 AM

#

May I know what you mean by predict as big as possible?

#

Sorry, I am very new in this.

#

Yes.

#

Yes, gradient descent.

#

I am sorry.

#

Sure.

#

https://github.com/digital-idiot/ML_ScratchPad

#

advanced_losses.py

#

No worries.

#

Thank you for trying 😀

#

No issues 😀

#

Though do you know why we generate accuracy matrices for test dataset?

#

From that repository, I got three matrices.

#

#

But I don't understand why we need matrices for test and training.

#

Ohh. We have used that for other datasets.

brave crypt Apr 24, 2021, 2:13 AM

#

brave crypt

This is on my local machine.

#

#

I am sorry, I didn't know the right word.

#

I see.

#

One last question:

#

I got IoU: 0.42 for train, test and validation.

#

Is that bad?

#

😀

#

I see.

#

Thank you for your help 😀

#

@wide spear Thank you for patiently explaining things to me.

#

😊

wide spear Apr 24, 2021, 2:17 AM

#

catThumbsUp

#

Lol rip hitbox

#

Oh nevermind

#

PaimonSpinner

brave crypt Apr 24, 2021, 2:18 AM

#

Haha

#

Nobody likes Tensorflow.

#

May I ask one more question?

wide spear Apr 24, 2021, 2:29 AM

#

yes

brave crypt Apr 24, 2021, 2:30 AM

#

Thanks

#

Ohh, I figured it out.

#

Thanks @wide spear 😀

wide spear Apr 24, 2021, 2:31 AM

#

catThumbsUp

slow niche Apr 24, 2021, 11:27 AM

#

Does anyone know what a "dependence set" is? (in the context of a matrix).

#

I'm reading a paper and they define it like:

#

for the upper characteristic we define depsU(i) as the set of all the indexes j such that the coefficient m_{i,j} of the matrix M (of the linear layer) is non-zero.

#

#

For the matrix:

They obtain the sets:

#

#

It seems like the i coefficients correspond to the positions of the 1's in the matrix, but I don't understand why the j coefficients are offset

#

for example, in deps(0,j) , 0, 2, 3 correspond to 1s in the first row of the matrix

#

but where do (j+2)%4 or (j+1)%4 come from?

#

the paper doesn't really go into any more detail than what ive posted so i am lost on how they derived depsU

wide spear Apr 24, 2021, 1:50 PM

#

Is this in the context of finite difference methods for pdes

naive creek Apr 24, 2021, 2:36 PM

#

Warning: long message
Hello, I'm using gmres in MATLAB for preconditioning on 2 specific sparse matrices A1 and A2. Their sparsity pattern, generated with spy is shown in the screenshot below. For both matrices A1 and A2 the following code is executed (A1 and A2 are A):

x0 = gmres(A,b);

y = gmres(@(x) A*( solve_Ub( U, solve_Lb(L,x))),b);
x1 = solve_Ub( U, solve_Lb(L,y));

LU is the incomplete lu factorisation of A. solve_Ub and solve_Lb are specific functions which return y for which Uy=b and Ly=b respectively. x0 is the solution of Ax=b where no preconditioner is used and x1 is the solution of the same system but with the incomplete LU-factorisation as a preconditioner.

Executing the code generates the following output for A1:

gmres stopped at iteration 10 without converging to the desired tolerance 1e-06
because the maximum number of iterations was reached.
The iterate returned (number 10) has relative residual 0.29.

gmres converged at iteration 2 to a solution with relative residual 8.2e-13.

for A2:

gmres stopped at iteration 10 without converging to the desired tolerance 1e-06
because the maximum number of iterations was reached.
The iterate returned (number 10) has relative residual 0.77.

gmres stopped at iteration 10 without converging to the desired tolerance 1e-06
because the maximum number of iterations was reached.
The iterate returned (number 10) has relative residual 0.086.

So the preconditioner works well for A1, where gmres already converges at iteration 2, but with A2 the preconditioned run still fails to converge. My question is: does anybody know why this works relatively well for A1 and not for A2? I thought that A2 might be badly conditioned, but after calculating the condition numbers of A1 and A2, I found that A1 actually has the larger condition number. Sorry for the long question, and thanks for reading. Any suggestion would be much appreciated!
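
For readers who don't use MATLAB, the same right-preconditioning pattern can be sketched with SciPy. This is only an illustration under assumptions: the test matrix is a stand-in 2-D Poisson operator (not A1 or A2), and `spilu` uses a thresholded incomplete LU that may differ from the factorisation used in the assignment. Here `ilu.solve(v)` plays the role of `solve_Ub(U, solve_Lb(L, v))`, and `y` is the vector returned by the solver.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, gmres, spilu

# Stand-in test matrix: 2-D Poisson operator on a 10x10 grid (not A1 or A2).
n = 10
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsc()
b = np.ones(A.shape[0])

# Incomplete LU factorisation; ilu.solve(v) applies M^{-1} = U^{-1} L^{-1}.
ilu = spilu(A)

# Right preconditioning: solve (A M^{-1}) y = b, then recover x = M^{-1} y.
AM_inv = LinearOperator(A.shape, matvec=lambda v: A @ ilu.solve(v), dtype=float)
y, info = gmres(AM_inv, b)
x1 = ilu.solve(y)

rel_res = np.linalg.norm(A @ x1 - b) / np.linalg.norm(b)
print(info, rel_res)   # info == 0 means the solver reported convergence
```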

slow niche Apr 24, 2021, 2:40 PM

#

wide spear Is this in the context of finite difference methods for pdes

No, cryptography

wide spear Apr 24, 2021, 2:41 PM

#

Tiboat

#

Are you looping with the LU? If so, are you using a sparse LU?

#

You might be getting a lot of fill in for the second one

prime kraken Apr 24, 2021, 2:43 PM

#

in general, most preconditioning* methods destroy the sparsity :x

#

you might do better with something like a jacobi preconditioning

wide spear Apr 24, 2021, 2:44 PM

#

slow niche No, cryptography

Ok it looks like the j is offset in backwards order

naive creek Apr 24, 2021, 2:44 PM

#

wide spear Are you looping with the LU? If so, are you using a sparse LU?

the LU is sparse

wide spear Apr 24, 2021, 2:44 PM

#

So it’s like (i, j+(4-i) mod 4)

#

I don’t think the preconditioning is being applied to the matrix

naive creek Apr 24, 2021, 2:46 PM

#

I don't really get what you mean by looping the LU and also with the fill in for the second one 😦 . (I'm a big noob in this, sorry)

wide spear Apr 24, 2021, 2:46 PM

#

It’s being used to find an initial guess for a solution?

slow niche Apr 24, 2021, 2:47 PM

#

wide spear Ok it looks like the j is offset in backwards order

What do you mean exactly?

wide spear Apr 24, 2021, 2:47 PM

#

wide spear So it’s like (i, j+(4-i) mod 4)

.

naive creek Apr 24, 2021, 2:48 PM

#

Well, I'm not practically using this to solve a system of linear equations for some purpose. This is more like a theoretical question I'm trying to solve.

wide spear Apr 24, 2021, 2:48 PM

#

Let me think about this more when I get out of bed

naive creek Apr 24, 2021, 2:49 PM

#

ok thanks 🙂 haha

slow niche Apr 24, 2021, 2:50 PM

#

wide spear I don’t think the preconditioning is being applied to the matrix

i'm not sure it holds for this second case:

matrix M^-1:

sets:

#

naive creek Apr 24, 2021, 2:50 PM

#

prime kraken in general, most preconditioning* methods destroy the sparsity :x

Thanks for the information! Does the loss of sparsity have anything to do with why it works for A1 and not for A2, if you happen to know?

prime kraken Apr 24, 2021, 2:52 PM

#

not necessarily, but it makes stuff a lot slower to compute

slow niche Apr 24, 2021, 2:53 PM

#

slow niche

i guess here it is just (i, j+i % 4) - the real issue is why they do that 😄

#

for both U and L

#

I dont really understand what they mean by "coefficient" in the def:

naive creek Apr 24, 2021, 2:54 PM

#

prime kraken not necessarily, but it makes stuff a lot slower to compute

Ok thank you! Then I already know that isn't the answer

slow niche Apr 24, 2021, 2:55 PM

#

we define depsU(i) as the set of all the indexes j such that the coefficient m_{i,j} of the matrix M (of the linear layer) is non-zero
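
Read literally, the quoted definition is just "the column positions of the non-zero entries in row i". A small sketch of that reading follows; the matrix is hypothetical (the one from the paper is not shown here) and was chosen only so that its first row has ones in columns 0, 2 and 3, as in the example mentioned above. It does not explain the `(j+k)%4` offsets, which presumably come from how the paper indexes the state.

```python
def deps(matrix):
    """Map each row index i to the set of column indexes j with matrix[i][j] != 0."""
    return {
        i: {j for j, m_ij in enumerate(row) if m_ij != 0}
        for i, row in enumerate(matrix)
    }

# Hypothetical 4x4 binary matrix, not the one from the paper.
M = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]
print(deps(M))
```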

wide spear Apr 24, 2021, 3:02 PM

#

Ok

#

tiboat

#

You are using GMRES to determine an initial guess for your iteration?

#

I'm not entirely sure what your Matlab is doing

#

Mostly because I'm not very familiar with it

wide spear Apr 24, 2021, 3:03 PM

#

slow niche ``` we define depsU(i) as the set of all the indexes j such that the coefficient...

I have no clue what's going on either

slow niche Apr 24, 2021, 3:04 PM

#

😦

wide spear Apr 24, 2021, 3:06 PM

#

You might consider asking in the CS server linked in #old-network , they probably have people who know more crypto

naive creek Apr 24, 2021, 3:07 PM

#

with x0 = gmres(A,b); I'm trying to get an approximate solution of Ax=b

wide spear Apr 24, 2021, 3:07 PM

#

Yes

#

I understand that

naive creek Apr 24, 2021, 3:08 PM

#

y = gmres(@(x) A*( solve_Ub( U, solve_Lb(L,x))),b); x1 = solve_Ub( U, solve_Lb(L,y)); also but with a right preconditioner LU

wide spear Apr 24, 2021, 3:09 PM

#

What does @(x) mean

#

And what do you do with x1

#

Do you do this entire thing again?

#

Or is x1 what you calculate the relative residual with

naive creek Apr 24, 2021, 3:10 PM

#

wide spear What does @(x) mean

there you kind of make a function with variable x

wide spear Apr 24, 2021, 3:10 PM

#

Ok y is a function of x?

naive creek Apr 24, 2021, 3:11 PM

#

yes

wide spear Apr 24, 2021, 3:11 PM

#

But you don't pass in x when you call y in x1

naive creek Apr 24, 2021, 3:19 PM

#

um no... I'm not even able to properly understand this haha 😓 . Maybe to give a bit more context: this question with the code was given for an assignment. We actually didn't see anything about gmres in the course. For this question we can also largely treat gmres as a black box. So tbh I don't precisely know what's going on with the gmres. The answer to the question is also probably not very related to the specific gmres method.

wide spear Apr 24, 2021, 3:20 PM

#

Ok so we should be thinking about qualitative differences between the two matrices?

naive creek Apr 24, 2021, 3:20 PM

#

I think so, yes

prime kraken Apr 24, 2021, 3:25 PM

#

then maybe it's a matter of looking at the error between the original matrix and the incomplete LU?

#

you won't have a good preconditioner if the norm of the error between the original matrix and the incomplete LU is comparable to the norm of the original matrix itself

naive creek Apr 24, 2021, 3:27 PM

#

Aha, that's a very hopeful suggestion. I will see what the error is!🙂

prime kraken Apr 24, 2021, 3:27 PM

#

particularly, the frobenius norm of the difference

#

or also the condition number of the preconditioned problem

#

see if it improves as much as the other one's does

#

(which should also be related to how good the incomplete LU was)

naive creek Apr 24, 2021, 3:52 PM

#

The Frobenius norm of the difference is 0.594262658318646 for A1 and 4.088316191164535e+04 for A2. So that's nice! Unfortunately I didn't see the Frobenius norm in the course. So if you want to take the condition number of the preconditioned problem, what do you take the condition number of exactly? Thanks for all the help already! and sorry for the late answer

wide spear Apr 24, 2021, 3:53 PM

#

Have you seen matrix norms before?

#

If so, which ones have you seen

naive creek Apr 24, 2021, 3:56 PM

#

Yes i've seen the 1-norm, 2-norm and infinity-norm.

wide spear Apr 24, 2021, 3:57 PM

#

Ok

naive creek Apr 24, 2021, 3:57 PM

#

oh wait is frobenius norm = 2-norm?

wide spear Apr 24, 2021, 3:57 PM

#

It is a fact that $\norm{A}_2\leq\norm{A}_f\leq\sqrt{r}\norm{A}_2$

pine jettyBOT Apr 24, 2021, 3:57 PM

#

Angetenar

naive creek Apr 24, 2021, 3:57 PM

#

ah nevermind

wide spear Apr 24, 2021, 3:57 PM

#

Where r is the rank of the matrix

prime kraken Apr 24, 2021, 3:58 PM

#

well, the 2-norm of a matrix is an induced norm, and it corresponds to the largest singular value

wide spear Apr 24, 2021, 3:58 PM

#

So you can also compute the 2-norm of the difference

prime kraken Apr 24, 2021, 3:58 PM

#

oh you had already started, oops

#

anywho, the frobenius norm is the square root of the sum of singular values squared, while the 2 norm is the largest singular value of the matrix, and they follow the property angetenar gave up there
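
A quick numerical check of these statements with NumPy, on an arbitrary random matrix (the matrix and seed are placeholders, not data from the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

sigma = np.linalg.svd(A, compute_uv=False)   # singular values, largest first
two_norm = np.linalg.norm(A, 2)              # equals sigma[0]
fro_norm = np.linalg.norm(A, "fro")          # equals sqrt(sum(sigma**2))
r = np.linalg.matrix_rank(A)

# The three numbers should appear in non-decreasing order.
print(two_norm, fro_norm, np.sqrt(r) * two_norm)
```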

wide spear Apr 24, 2021, 3:59 PM

#

A lot of matrix norms have these equivalency properties

#

Actually, all matrix norms are equivalent

prime kraken Apr 24, 2021, 4:00 PM

#

so if you compute the frobenius norm of the original matrix and the frobenius norm of the difference, you can get an idea of how bad the estimate is

#

equivalence of norms moment coming up

wide spear Apr 24, 2021, 4:00 PM

#

For any two matrix norms you have the inequalities $c_1\norm{A}_{\alpha}\leq\norm{A}_{\beta}\leq c_2\norm{A}_{\alpha}$

pine jettyBOT Apr 24, 2021, 4:00 PM

#

Angetenar

wide spear Apr 24, 2021, 4:00 PM

#

For arbitrary matrix norms

#

The moral is that the specific matrix norm you choose doesn't matter so much

prime kraken Apr 24, 2021, 4:01 PM

#

how concrete and bullet proof must your explanation of this problem be?

naive creek Apr 24, 2021, 4:02 PM

#

Thanks for all the information!! 😊

prime kraken Apr 24, 2021, 4:02 PM

#

at a handwavy level, i would say looking at the frobenius norm of the difference and the condition number before and after preconditioning should give a good idea of what's going on

#

or maybe relative frobenius norm difference

#

something like that

#

(or any other norm you like, as ange said)

#

oh, and as it turns out, it IS a consequence of ruining the sparsity

#

i had just missed the part where you said incomplete LU

#

since you have some M = LU - R, and LU is the incomplete LU decomp

#

the usual LU would destroy the sparsity, so you subtract this R term

#

and impose sparsity on L and U

#

this R takes up all the error

#

which you put in by making the LU incomplete to try and preserve the sparsity

#

when you do the frobenius norm of the error M - LU, it's really the norm of R that you see
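
The two diagnostics suggested above (relative Frobenius error of the incomplete LU, and the condition number before and after preconditioning) can be sketched as follows. This assumes a zero-fill incomplete LU, ILU(0), written densely for readability; the assignment's factorisation may use a different dropping rule. The helper names and both test matrices are illustrative. For a right preconditioner, the "preconditioned matrix" is taken to be A (LU)^{-1}.

```python
import numpy as np

def ilu0(A):
    """Dense ILU(0): Gaussian elimination that only updates entries where A is non-zero."""
    n = A.shape[0]
    LU = A.astype(float).copy()
    pattern = A != 0
    for i in range(1, n):
        for k in range(i):
            if not pattern[i, k]:
                continue
            LU[i, k] /= LU[k, k]
            for j in range(k + 1, n):
                if pattern[i, j]:
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, -1) + np.eye(n)
    U = np.triu(LU)
    return L, U

def report(A):
    """Return (relative Frobenius error, cond(A), cond(A (LU)^{-1}))."""
    L, U = ilu0(A)
    LU_prod = L @ U
    rel_err = np.linalg.norm(A - LU_prod, "fro") / np.linalg.norm(A, "fro")
    cond_before = np.linalg.cond(A)
    cond_after = np.linalg.cond(A @ np.linalg.inv(LU_prod))
    return rel_err, cond_before, cond_after

k = 5
T = 2 * np.eye(k) - np.eye(k, k=1) - np.eye(k, k=-1)   # tridiagonal: no fill-in to drop
P = np.kron(np.eye(k), T) + np.kron(T, np.eye(k))      # 2-D Poisson: ILU(0) drops fill-in
print(report(T))
print(report(P))
```

A small relative error means LU is close to the matrix, so A (LU)^{-1} is close to the identity and GMRES needs few iterations; a large one means the dropped part R matters.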

naive creek Apr 24, 2021, 4:08 PM

#

Thanks for all the help and information @prime kraken and @wide spear ! I think I'm getting it

brave crypt Apr 25, 2021, 10:32 AM

#

https://www.reddit.com/r/CFD/comments/hbmzhf/fluid_simulation_using_a_discrete_lagrangian/
Can someone share the mathematical model of this?

wide spear Apr 25, 2021, 9:12 PM

#

@echo ferry here fits better

#

Anyways

echo ferry Apr 25, 2021, 9:12 PM

#

thank you, I'll move it here

wide spear Apr 25, 2021, 9:12 PM

#

I do not know the answer to your question

echo ferry Apr 25, 2021, 9:12 PM

#

Sorry if this is wrong place for it, but that's the one that was recommended to me, and I don't think that early university statistics apply (although I'm a computer science PhD, perhaps for mathematicians that is trivial)
Below formula comes from "Conjugate Bayesian analysis of the Gaussian distribution" available at https://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf
In Normal-Inverse Wishart prior, what is the role of kappa_0?
In my case, I'm using it as part of Bayesian Rose Trees clustering. It is my understanding that NIW is used to model gaussian distribution in order to differentiate between clusters using the assumption that each cluster should follow gaussian distribution. Is it then correct that for given data and given Sigma (how to estimate Sigma is another problem), kappa_0 should be selected such that it would make Sigma/kappa_0 equal to the predicted variance of the distribution?

wide spear Apr 25, 2021, 9:13 PM

#

@prime kraken might know

#

If nobody knows here you might try asking in the AI/ML server linked in #old-network

#

(Edd is also asleep right now I think)

echo ferry Apr 25, 2021, 9:15 PM

#

yeah, I'm there, so I'll do that if no one here is able to answer. But I think I need more of a math understanding than an ML-based one. Cause, to be honest, I could just optimize it as a hyperparameter. But well, then I still won't know what it means, really

#

and my intuition tells me that I can actually pick a proper value based on the context of the application, I just have to... calibrate myself

wide spear Apr 25, 2021, 9:46 PM

#

Does 8da know

#

I guess not

brave crypt Apr 25, 2021, 9:49 PM

#

Ah, is this the problem of picking priors in bayesian stats? 🙂 I'm not very good at bayesian stats, but in general I think picking (the parameters to) a prior is not a simple thing. (Unless there is something specific about this problem where a certain way to choose is typically taken)

#

Indeed, I don't know 😞 but it could be fun to talk about this problem 🙂

wide spear Apr 25, 2021, 9:51 PM

#

I've asked my boyfie

#

Will update if he responds

echo ferry Apr 25, 2021, 9:52 PM

#

Hmmm in my particular case, the samples are event times. Clusters correspond to trips, days, series of photos

#

(depending on the level of hierarchy)

#

so a single gaussian distribution should represent a distribution of timestamps

#

Therefore some reasonable heuristics could be used based on the fact that people, for example, sleep for like 8 hours a day

#

this kappa_0 parameter in case of clustering is called a "scale factor", so I guess that I could probably select a scale that corresponds to the variance of timestamps during a single day

#

If my understanding is correct, the eq. 246 talks about the relation to the normal distribution with given mean and given variance

#

But then again, same paper uses italic N to denote normal distribution. I assume that's just a negligence. If what I'm saying is correct, then I can pick correct value of kappa_0 if I manage to get the Sigma right, which is another problem. But so far I don't really know whether my intuition is even correct, this is the first time I even heard of Wishart and my knowledge about priors in general is superficial.

wide spear Apr 25, 2021, 11:16 PM

#

Did you read the caption for figure 4

prime kraken Apr 26, 2021, 4:03 AM

#

i know little to nothing about bayesian estimation :( however, all throughout the manuscript, kappa_0 is the "belief that the prior mean is correct"

#

kappa_0 reduces the variance of the prior, meaning random realizations of the random process for the mean are closer to it, and increases the weight of the current (mu - mu_0)^T Sigma^-1 (mu - mu_0)

#

since one usually uses log likelihood for this, the argument of the exp comes out and this works almost like a proximal mapping

echo ferry Apr 26, 2021, 9:53 AM

#

wide spear Did you read the caption for figure 4

The one in context of Normal Inverse Chi Squared prior rather than Normal Inverse Wishart prior? Yeah, I did.
As Edd says, it's described there as "how strongly we believe that mu_0 is the prior mean". And they use values like "1.0" or "5.0". So I don't think that the value corresponds to, say, probability. I lack the intuition to properly scale kappa_0.
So you're saying that I can assume that Sigma in eq. 246 stands for variance and therefore I can think of kappa_0 as a way to change the variance of the model. Since in eq. 245 Sigma is equivalent to Inverse Wishart distribution of Lambda^(-1). Wherein Lambda_n "plays role of" posterior sum of squares (eq. 257). So some variance times how strongly we believe in it. So I assume that in fact the "how strongly I believe" is actually measured by the relation between nu and kappa. Is that correct reasoning?
I see that nu_0 is actually the number of degrees of freedom.
I do actually use that in log, as you've said.
I have to do some thinking to be able to actually put correct value in kappa_0, but you've given me a framework to work with, thank you! 🙂

prime kraken Apr 26, 2021, 10:31 AM

#

for completeness, it's like a proximal based on the mahalanobis distance, since the difference between the prior mean and the current mean has that Sigma^-1 in the middle, so the coordinates are rescaled based on that inverse covariance

#

the interpretation becomes a bit muddy, though, because both this "proximal" term and the other term in the cost function depend on lambda

echo ferry Apr 26, 2021, 11:20 AM

#

After some discussion elsewhere, I've become quite convinced that kappa_0 should correspond to the number of samples in previous experiments that led to current parameters. So it's a measure of inertia of the current hyperparameters of my model. So I have to focus on different hyperparameters first, and then I will be able to say how far the distribution could change as a result of further data. Do you see any flaw in that?

prime kraken Apr 26, 2021, 11:40 AM

#

yep, that's an alternative interpretation to it

#

if kappa_0 is large, the model will preferentially stick to the first guess of the mean, i.e. mu_0

#

if you have lots of previous samples, one would expect mu_0 to be a good guess that need not be modified much
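
The "pseudo-count" reading can be seen directly in the standard conjugate update for the mean part of the Normal-Inverse-Wishart prior: mu_n = (kappa_0 mu_0 + n x_bar) / (kappa_0 + n) and kappa_n = kappa_0 + n. A minimal sketch, where the function name and the toy timestamps (in hours) are made up for illustration:

```python
import numpy as np

def posterior_mean_params(mu_0, kappa_0, data):
    """Return (mu_n, kappa_n) for the mean part of a Normal-Inverse-Wishart update.

    data has shape (n, d): n observations of dimension d.
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    x_bar = data.mean(axis=0)
    kappa_n = kappa_0 + n
    mu_n = (kappa_0 * np.asarray(mu_0, dtype=float) + n * x_bar) / kappa_n
    return mu_n, kappa_n

timestamps = [[9.0], [10.0], [11.0]]       # toy event times in hours
for kappa_0 in (0.01, 1.0, 100.0):
    mu_n, kappa_n = posterior_mean_params([0.0], kappa_0, timestamps)
    print(kappa_0, mu_n, kappa_n)
```

With a small kappa_0 the posterior mean lands near the sample mean; with a large kappa_0 it barely moves away from mu_0, as if mu_0 had been estimated from kappa_0 earlier samples.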

thin vapor Apr 26, 2021, 12:18 PM

#

Hey so i have been asked to firstly write if the LMM is explicit or implicit $y_{n+1} - y_{n} = h\left(\frac{1}{2}f_{n+1} + \frac{1}{2}f_{n}\right)$

pine jettyBOT Apr 26, 2021, 12:18 PM

#

B1GW0LF

thin vapor Apr 26, 2021, 12:19 PM

#

I have been looking through the lecture notes for 2 hrs at this point and I am just getting more and more confused

brave crypt Apr 26, 2021, 12:22 PM

#

,w LMM

pine jettyBOT Apr 26, 2021, 12:22 PM

#

Results provided by WolframAlpha

thin vapor Apr 26, 2021, 12:27 PM

#

yes

#

I think that it is Implicit

#

but i am not 100% confident on why
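
One way to see why: f_{n+1} stands for f(t_{n+1}, y_{n+1}), so the unknown y_{n+1} appears on both sides of the formula and each step requires solving an equation. That is what "implicit" means for a linear multistep method; this particular one is the trapezoidal rule. The sketch below assumes the linear test problem y' = lam * y, where the equation can be solved by hand, and contrasts it with an explicit forward Euler step. The function names are illustrative.

```python
import math

def trapezoidal_step_linear(y_n, h, lam):
    """One trapezoidal step for y' = lam * y, solved for y_{n+1} by hand.

    y_{n+1} - y_n = h/2 * (lam*y_{n+1} + lam*y_n)
    =>  y_{n+1} = y_n * (1 + h*lam/2) / (1 - h*lam/2)
    """
    return y_n * (1 + h * lam / 2) / (1 - h * lam / 2)

def forward_euler_step_linear(y_n, h, lam):
    """Explicit comparison: y_{n+1} depends only on already-known values."""
    return y_n + h * lam * y_n

y, h, lam = 1.0, 0.1, -2.0
for _ in range(10):                      # integrate from t = 0 to t = 1
    y = trapezoidal_step_linear(y, h, lam)
print(y, math.exp(lam * 1.0))            # numerical value vs exact solution
```

For a general nonlinear f there is no closed form, and the step is usually solved with a fixed-point or Newton iteration.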

pine jettyBOT Apr 26, 2021, 1:30 PM

#

whzup

echo ferry Apr 26, 2021, 1:52 PM

#

prime kraken yep, that's an alternative interpretation to it

In my case I don't have any previous examples, but that lets me pick a proper value for this parameter when I will establish other ones 🙂 so thanks!

brave crypt Apr 26, 2021, 9:25 PM

#

@wide spear I hope you remember me.

brave crypt Apr 26, 2021, 9:26 PM

#

wide spear https://arxiv.org/pdf/1905.12120.pdf

We discussed about this paper 😀

wide spear Apr 26, 2021, 9:26 PM

#

Yes

#

The bad paper

brave crypt Apr 26, 2021, 9:27 PM

#

Haha

#

Yes 😅