#numerical-analysis

1 messages · Page 26 of 1

viscid crag
#

do we assume?

#

rounding to nearest

#

occurs by the machine?

fathom rain
#

and did you mean 1/y is less than machine epsilon or y less than machine epsilon?

viscid crag
#

y

#

do we round to nearest for y?

#

if i can figure that out question is ez, since 1/y is allowed to have more decimal digits than the mantissa

#

of the the two numbers being dividied

fathom rain
viscid crag
#

its a rounding method when u cant represent number w floating point precison

fathom rain
#

ok so you mean smallest possible number of the data type? ```
julia> floatmin(Float64)
2.2250738585072014e-308

julia> eps(Float64)
2.220446049250313e-16

viscid crag
#

oh wait i figured question out im dumb

fathom rain
#
julia> 1/(1E-400)
Inf

😛

viscid crag
brave crypt
#

would DFAs and regex conversion fit into here?

#

I just need to check if this NFA describes (ab)*(ba)*+a(a)*

#

I hand verified most of the cases but I'm concerned I missed something

sharp peak
#

@brave crypt I'm not completely familiar with that notation, but if + is an union then aren't you missing the "a" word?

brave crypt
#

Aw dammit, okay I have an older variation but I thought it didn't really make sense

#

and yes + is union, its notation from Miruoka

sharp peak
#

As a deterministic automaton it doesn't make sense because q0 has two paths with the same input, I think it might be missing an $\epsilon$ path between q3 and q2, but it's been a while since I studied this stuff

pine jettyBOT
#

nanashi

brave crypt
#

Oh, do NFAs need to abide by that condition as well?

sharp peak
#

Sorry for not being of much help

#

No

brave crypt
#

thats what I was assuming from the start

#

oh

#

okay then this would be acceptable as an NFA conversion?

sharp peak
#

Do you not require to make explicit the empty string paths?

brave crypt
#

my prof wasn't really clear on that, and the book wasn't making it to clear if they wanted empty strings either

#

i see it in explicit calculations when they do some abstract stuff, but otherwise there is no other mention

sharp peak
#

I remember it being more or less a must because not every path was connected by empty strings, but I'd probably lie if I asserted anything, I don't think I'll be of much help

#

Sorry

brave crypt
#

alright, well thank you for reminding me of that bit about NFAs, I'll think I'll go with the second variation and see if there is much else to be done

brave crypt
#

aight i've got one more for checking over if anyone else is interested, I tried to use some minimizing techniques from JFLAP to get it down a bit more:
$((a^*b^a^)^b)^$

pine jettyBOT
#

Manzareh

brave crypt
#

I spent some time today optimizing the second and the first regexes to see if it would be easier to check my work and all I was able to get down to was $(ab)^(ba)^ \cup a^*$ and $(a^b)^$

pine jettyBOT
#

Manzareh

brave crypt
#

it appears with JFLAP the generation of NFAs with the regex to NFA converter tends to get less noisy (with all the empty states) as the regex gets shorter which mostly lines up with what I've read, however I still don't really find either of these solutions particularly satisfying

hardy pier
#

here's an equation I was able to come up with for such a vector field, idk if it's even correct, I'm assuming it is lol
[b-a] is the vector from a to b
R is the radius of the circle

vivid raven
#

Hi does anyone here know lambda calculus?

wide spear
vivid raven
pine jettyBOT
pine jettyBOT
fathom shell
#

Anyone good with secant method that can help to determine the radius of a cone with surface area $S=1200 m^2$ and height $ h=20m$? Im not sure what my function should be in this case

pine jettyBOT
wide spear
#

It looks like you need help with coding

merry tinsel
#

indeed

#

thanks!

green locust
#

I don't know if this is the good channel to ask, I don't know if there's one (the CS discord doesn't seem to have one either)

#

I have this exercise, I have the intuitive idea but I really struggle with properly writing it down and formulating the argument

#

Let $s, k \in K<<A^>>$ be proper series. Show that $(s + t)^ = s^(ts^)^*$

pine jettyBOT
#

Philémon

green locust
#

These are formal series

#

And a proper series is a series over a semiring $K$, so a function $s : A^* \to K$ such that $s(\epsilon) = 0_K$

pine jettyBOT
#

Philémon

wide spear
green locust
#

Well, this is computer science though

#

That's why I went on this channel

civic ledge
#

hi

#

anyone familiar with taylor series here?

#

i wanted to know what kind of projects can be done under the said topic

wide spear
#

What sorts of projects are you interested in

civic ledge
#

any

#

a mini project to be precise @wide spear

wide spear
#

What level of math are you at

civic ledge
#

school grad

#

i mean college level

#

or maybe eulers method

#

not necessarily taylors

#

Bisection method or central difference, forward or backward difference also..

#

You there @wide spear ?

fathom rain
#

how big is a "mini project"

civic ledge
#

not that big,

#

u could refer this

#

its more or less like an investigation or a research about a particular topic that could be used in engineering applications

fathom rain
#

If that is your teacher, maybe do something on finite volume methods? he seems to be into fluid mechnaics

civic ledge
#

nope he isn't

#

one professor suggested me the webiste

#

to refer*

fathom rain
#

ok

civic ledge
#

while doing the project

#

eulers method
not necessarily taylors
Bisection method or central difference, forward or backward difference also..

#

based on these topics

#

or related to computational

fathom rain
#

I have ideas related to my research, but I don't have much time to explain stuff and I don't know your background. But I can give you easy to digest references

civic ledge
#

tq

fathom rain
#

I am pretty sure your teacher has never heard of that 🙂

#

(it is kind of an advanced form of Fourier analysis)

#

but easy to learn the basics for somebody with like msc background

#

just tell me if it sounds interesting and i can give you references

civic ledge
#

yea sounds cool

#

but the problem is i dont really have a msc background

#

still UG

fathom rain
#

Other people should come with suggestions 🙂

wide spear
#

You can do perturbation analysis for a differential equation

#

This also touches upon Taylor series which you mentioned earlier

#

Say you have a differential equation $u''+\eps u=u^2$ or something which has a solution $u(t,\eps)$ and then you solve first for $\eps=0$ and then perform a Taylor series expansion of $u$ around $\eps=0$

pine jettyBOT
#

Angetenar

wide spear
#

And then you can compare this with what you get numerically

lusty spire
#

Hi everyone. I was wondering if anyone's got any resource recommendations for transitioning from applied math graduate school into industry. Graduating in May and I'm starting the job search!

wide spear
#

What is your area of specialization

#

What do you want to do in industry

lusty spire
lusty spire
wide spear
#

Ok if you want to work in aerospace, brush up finite element stuff

lusty spire
#

United States

fathom rain
wide spear
#

If you want to work at Intel on something like MKL then you should be familiar with how algorithms work in hardware contexts

lusty spire
#

This is great information, first off thank you

prime kraken
#

on signal processing, i can recommend petre stoica's spectral analysis of signals and yonina eldar's compressed sensing book (i forget the name)

wide spear
#

This is some stuff that Intel people are doing

fathom rain
#

I would say if you want to work in aerospace it is almost expected to have a PhD, at least from my experience

wide spear
#

If Intel interests you, then Nvidia and AMD are also doing similar things

lusty spire
#

I'd say working in that realm (Intel) is my top pick

wide spear
#

Nvidia is more GPU focused, do you have experience with cuda and stuff

lusty spire
#

I do not

lusty spire
#

I have a class this semester where I can read up on basically anything I want, so if you have any recommendations for what I could read into for that area I'd be grateful!

wide spear
#

Also communication avoiding algorithms are a big deal at Intel and co

lusty spire
#

Okay, I'm going to look into that and their MKL. Thank you for the info!

fathom rain
#

maybe mixed precision computing can be of interest?

loud bough
#

Any ideas on how to start this? Does it just mean we need k terms of the Taylor series to get a reasonable estimate of the root?

raven dew
#

As for how to start it - one method is to just write x_(n+1) - r in terms of x_n and r and expand f(x_n) and f'(x_n) as taylor polynomials with enough terms that you get something non-zero; stuff will start to cancel out and you'll be left with the quadratic convergence required

#

(I'd use r = 0 WLOG tbh)

raven dew
#

np

hybrid parrot
#

@past pebble you're solidly in qfin right? maybe you can help @tardy ember

#

hes a phd math looking into qfin industry

past pebble
hybrid parrot
#

oh no I just thought to introduce @tardy ember to someone familiar with qfin industry

tardy ember
#

Hello!

past pebble
#

hello

#

i should say though that qfin is very varied and the term itself thus rather meaningless (not to mention that finance itself is all about detail so often you'll find that "good firms" exist in the same sense as "good universities", i.e. comparison makes more sense at a team/group level)

tardy ember
#

Interesting, that makes sense

hybrid parrot
#

some firms i more associate with papers than others though

#

😏

past pebble
#

well, e.g. AQR is very academic (lots of staff having PhDs) and publish plenty of whitepapers (probably not much in the peer reviewed space), but more from an econ perspective

#

and I don't think there's enough math to keep a math PhD entertained in the smart beta/factor investing land, though I'm ignorant of the deets

#

lopez de prado was there for a bit trying to bring ML to the space, but I guess it didn't work

hybrid parrot
#

lopez do prado, wherever he is now, has some neat results on hitting times and analytical optimal entry/exit

#

ive seen aqr names on a bunch of arxiv papers so it sounds like a place for phds yeah

tardy ember
#

Makes sense

#

For what it's worth I'm largely accepting that I'll probably enter into qfin (or really anything other than academia) on its terms rather than my own if that makes sense. Whether I find the work as interesting as my current stuff, it does still seem like somewhat interesting work, pays well, etc. Hence why I'd also consider e.g. trading.

past pebble
#

sure, but there's plenty of firms doing very different things is all, finance is all about fine detail and finding a niche and being good at it

#

people often think, and this is even more true on all the internet forums, that finance is all about stocks, etfs, or exchange traded products

hybrid parrot
#

uhh wheres that quant shop with lots of ex-msri peeps, some still publishing non-quant papers

#

always funny seeing a rando result in eg protein folding coming out of a quant

past pebble
#

i believe e.g. de shaw (the person) is doing protein folding himself nowadays, not finance

wide spear
#

DE Shaw has a research division

#

There's also Renaissance

#

Which was founded by Jim Simons

past pebble
#

and as for trading, there's a vast spectrum there too. one could just execute (for a PM/signal), or you could be taking views of the market (so being a PM, or being part of designing the strategy), or something in between or both; and again one might ask if a quant trader is someone guiding trading algorithms, or someone trading high touch illiquid derivatives

tardy ember
#

@past pebble sorry I was out getting food

#

So right now I'm looking at trading and/or QR internships mostly. I can't quite do AQR or SIG until summer 2023

#

Also don't know coding or stats so I don't know if research is at all accessible in the near future

#

Been looking at Jane Street, also today was told about De Shaw so maybe that too

wide spear
#

There's also Optiver and 2 sigma and 5 rings

#

Hudson river trading

hybrid parrot
#

Flow Traders

#

Ginkgo

#

Jump Trading or w/e its called

#

Point72

tardy ember
#

I'll keep these names in mind

hollow hare
#

how can ODEs like y'(x) = y(sin(x)) for y(0) = y0 be solved?

#

solved numerically

#

are there algorithms for such ODEs?

pine jettyBOT
#

itsmeyash31

dawn viper
#

ig since |sin x| < |x|

#

you can numerically solve it

#

otherwise monkagiga monkagiga monkagiga

prime kraken
#

isnt this separable

#

or am i missing smth

random hornet
#

They probably don't mean y\times sin(x) on the right, but rather that y depends on sin(x) or sth

#

The latter is not immediately sensible to me though

prime kraken
#

ah

languid gate
#

a closed form sol will exist in this case?

wide spear
#

No it won't

#

Why would a closed form solution exist

#

Anyways

#

I don't see what the issue is

#

You just solve it numerically

hollow hare
#

`but how do you solve it numerically? I don't think it is that straightforward

wide spear
#

Forward euler?

#

I can write up the details later but you just get a time marching procedure

mint hemlock
#

y'(0) = y(sin(0) = y(0)
y(delta t) = y(0) + y'(0)*delta t + O(delta t^2)
y'(delta t) = y(sin(delta t)) = y(delta t + O(delta t^3)) ~ y(delta t)
and so on

#

that could be wrong but that's my first instinct to check

wide spear
#

If sin(t) is not a point that you have explicitly you can just compute it with linear/polynomial interpolation

dawn viper
cinder blaze
#

hi everyone, I'm trying to simulate a physic system in which I smoothly transport a particle by slowly changing the hamiltonian. I'm using the scipy package on python with eigh function to diagonalize. When I take the inner product between neighbors states (that should be nearly identical) I get negative signs in some cases, like in here:
product:(0.9966026884156858-0.05932605607477859j), psi1: (0.032943245023489+0j), psi2: (0.03339802473783493+0j)
product:(0.9966084381126055-0.0584749753051199j), psi1: (0.03339802473783493+0j), psi2: (0.03403795510530472+0j)
product:(0.9969452119066843-0.05402623734400029j), psi1: (0.03403795510530472+0j), psi2: (0.03485994413495064+0j)
product:(-0.9975294263339476+0.04629996657310793j), psi1: (0.03485994413495064+0j), psi2: (-0.035831790417432785+0j)
product:(0.9981950162523199-0.03622141578551043j), psi1: (-0.035831790417432785+0j), psi2: (-0.036901943385601044+0j)
product:(0.9987735107759269-0.024920471350837484j), psi1: (-0.036901943385601044+0j), psi2: (-0.03800549358129501+0j)
product:(0.9991503307680749-0.013362420884015513j), psi1: (-0.03800549358129501+0j), psi2: (-0.0390763133764267+0j)

#

psi1 and psi2 are the first coordinate of each eigenvector

#

any ideas?

wide spear
slow shale
#

Hi everyone, I had a question about the nelder-mead algorithm. I know that its quite slow but I'm having trouble figuring out its order. I don't think its 0, 1, 2 since its quite slow so I think its 3 or greater?

wide spear
#

By order, do you mean runtime or error

slow shale
#

Sorry I should've clarified, I mean error

wide spear
#

Not very much is known about the convergence of Nelder-Mead

#

There are some results in very special cases but overall, the picture remains fuzzy

slow shale
#

Oh I see, thanks! Is this the same case for its runtime?

wide spear
#

I mean usually with runtime, you just multiple # of iterations * cost/iteration

#

And we can't say anything about # of iterations

slow shale
#

makes sense

#

thank you

slow shale
#

I guess I also have a follow up, unrelated question. I'm being asked what optimization methods might be useful when we are stuck in ravines where the condition number of the Hessian is large. The options are preconditioning, gradient descent, newton's method, and conjugate gradient.

#

I've ruled out gradient descent because the gradient changes way too rapidly in such situations for gradient descent to be efficient

#

And more generally I think first-order methods struggle in situations like this. But I'm not sure about the other three

wide spear
#

By ravines do you mean local minima

#

I think the best thing to do here is precondition

#

Precondition is a pretty vague thing but generally it refers to techniques that make the problem easier by reducing the condition number

#

One way you can do this is transform the function being optimized by stretching/squeezing to make the level sets look more like circles

#

So the condition number of the Hessian will decrease as the gradient won't change as rapidly

slow shale
#

Oh I see. I can select multiple answers here, so do you think the other methods could also be useful? I don't think newton's would be good here since second-order methods require the inversion of a hessian and if the hessian is very ill-conditioned then that might be tough

#

I'm not sure about conjugate gradient though

wide spear
#

I mean in general, you don't have any guarantee of convergence with any of the 3 if you have non-unique minima right

slow shale
#

yeah

wide spear
#

They don't have any ways of getting out of local minima

#

If you were doing ML I would suggest hand tuning hyperparameters in your SGD

#

Lol

slow shale
#

Hmm then yeah I think you're right preconditioning might be the best

#

thank you!

#

which seems to indicate that newton's could be good here haha

wide spear
#

That doesn't help Newton get out of local minima though

lusty spire
#

Would anyone be willing to try to explain aliasing in regards to interpolatory quadrature and the Discrete Fourier Transform? I've gone to my professor on the subject twice now and his explanations aren't getting through to me

outer meteor
#

any recommendations on introductory expositions on Hilbert transform?

naive moat
naive moat
#

Unrelated question. In the paper

Kaprzyk, S. "An algebraic formulation of the simplex linear method for multiple integrals over the d-dimensional domain." Computer Physics Communications 183.1 (2012): 20-25.
Kaprzyk finds that integrals in N-D of the form

#

The context is basically that the former integral is useful for piecewise rational interpolation for integration in n-dimensional spaces, tessellated by tetrahedra

#

**My question is: ** is there a deeper meaning to a formula like this? Like, what's the role in this special structure for multidimensional integrals? I get the Jacobian determinant comes about for a change of variables, but this formula seems to have so much structure (and is a ratio of determinants!)

#

In the paper, he derives the formula via a proof by induction

#

but I'm afraid the intuition escapes me, despite the fact that the highly structured form of the result seems like there should be a deeper meaning or some intuition

#

For example, since determinants are involved, I imagine a clever change of variables should yield a Jacobian determinant times some easily integrable function that produces the results, but I can't see how to do that easily (what change of variables? how to keep track of all the integration ranges? etc)

orchid sequoia
#

Looks like Cramer's rule

sharp ore
#

What is r^(d) ? You showed us a formula for r_0 and then one for r, how are we supposed to link the two together?

naive moat
sharp ore
#

What about phi_d?

naive moat
#

Here's the additional pics to above:

naive moat
#

I mean, the part that blows my mind is that the resulting structure is independent of the integrand, so long as the function's argument is of the linear combination form

wide spear
#

It isn't though?

#

The right hand column of the determinant in the numerator is very dependent on the integrand

naive moat
sharp ore
# naive moat Here's the additional pics to above:

so I didn't do it all the way but I think the trick here is just to develop the determinant along the last column of phi_d, so you get a sum of the integrals phi_d.
To get that sum from the r_0 I think you just need to make multiple changes of variables u_d = everything between the parenthesis of phi_0 and then u_i = ... and so on iteratively

#

Uh wait, I guess you only need to do the change of variables once for u_d and proceed by induction if you suppose the formula true for d-1 variables z1,...,z_(d-1)

#

I only did a cursory check with the first change of variable but you can already see the vandermonde determinant start to appear.

naive moat
sharp ore
#

I'm not sure there's straightforward geometrical intuition. I guess both determinant and integral represent a volume so looking at what volume they represent could be something to explore, but that's not going to be something you think of when seeing the integral, it'd rather be an interpretation after the fact. I suggest trying to visualize all the volumes involved with simple examples, in geogebra for instance.
As for the "how do we get the intuition to find this formula", I think it's not about geometrical intuition but more of an analytical one: you just have to see that you can do a linear change of variables for each individual integral and see what you get, then t'recognize a pattern.

#

Maybe someone with better understanding of volumes than me will be able to give you a better answer though

loud bough
brittle atlas
outer meteor
#

@brittle atlas appreciate it. intro grad level is great.

brittle atlas
#

no problem! beyond that, probably something like Stein Singular Integrals book or another book on singular integrals would have more stuff

#

if the Duoandikoetxea book isn't what you're looking for, you could try asking in the analysis channel also

opal mason
#

I'm stuck on trying to show that there is a householder matrix for any 2 vectors with the same length in C^M

#

okay so as I was typing this I think I just got an idea.

#

I'll type it anyways if it's a dud

pine jettyBOT
opal mason
#

Anyone have any tips of how to approach this, or if my idea is correct?

wide spear
#

Are you using v for the householder matrix when you should be using u?

opal mason
#

What I basically did was just expand out $$Fx = (I - 2 \frac{vv^*}{v^*v})x$$ and got that $Fx = \alpha y$ only when $$2\frac{v^*x}{v^*v} = 1$$

pine jettyBOT
opal mason
#

So I tried showing that can be reduced to something true, i.e. 0 = 0. I was able to reduce it to $${\bar \alpha}y^*x = \alpha x^*y$$

pine jettyBOT
opal mason
#

So I just need to show that for a given x, there exists a y vector such that above holds, which works when x and y are real multiples (which I'm thinking is technically different from the scaling done by $\alpha$ (i.e. since I assumed $x \neq \alpha y$), since $\alpha = e^{i\theta}$ is a complex number when at least $\theta \neq 0$

pine jettyBOT
opal mason
#

<@&286206848099549185> miyano_cursed

wide spear
#

@opal mason I will take a closer look today

pine jettyBOT
green locust
#

Hi ! Does anyone have experience with well quasi-orders ?
I have an exercise about decidability of certain problems that uses WQOs
I'm usually terrible at computability but I think I found something convincing, so if someone is up to talk about it and shatter my dreams, don't hesitate to ping me 😄

#

Maybe that's more of a CS thing but I'm trying nonetheless

wide spear
#

You might also try in #foundations for questions about computability

green locust
#

Thanks, I'll try it !

fathom shell
#

how can i compute least squares with cholesky factorization?

Can someone tell me what im doing wrong in this process?

im supposed to get this matrix returned by the function : [6., -2., 2.] but instead im getting this [1. 0. 0. 0.]

wide spear
#

Are you solving the normal equations?

#

The normal equations say that $X^TXx=X^Ty$

pine jettyBOT
#

Marius von Hagen

wide spear
#

So you should replace rhand with np.dot(Ain.T,rhand)

#

(Also np.matmul(A,B) is preferred over np.dot(A,B) for matrix multiplication)

#

Also you should be multiplying A^TA, not AA^T

fathom shell
wide spear
#

You should be able to adapt this proof to your problem

opal mason
#

hmm link doesn't work for me

wide spear
#

It’s just very slow

proud lion
#

this is for cryptography class so I hope it's the right channel

#

but I need help with figuring out the probabilities

#

don't really get it because isn't every deck going to be unique

#

so wouldn't the probability be 1/6 for every deck

wide spear
proud lion
#

ty

fathom shell
#

when would we want to find the pseudoinverse(Moore penrose) of a non square matrix ?

wide spear
#

To solve least squares

fathom shell
# wide spear To solve least squares

Why is it generally a bad idea to use the inverse to calculate least squares ? as in why do we use cholesky or LU instead of just doing inverse?

wide spear
#

The normal equations are ill conditioned

#

Calculating the inverse is always bad

#

Also because of floating point reasons

fathom shell
#

why cant we just use pseudo invese for ill conditioned? is it because it is slower than factorization?

wide spear
#

Calculating the pseudo inverse is very slow because the SVD is slow

fathom shell
#

i see, so in rough terms its all about optimization

wide spear
#

"Computing the inverse requires more arithmetic operations than computing an LU factorization."

#

Especially when doing this for the normal equations like EmilB was doing you can do even better than LU with Cholesky

fathom rain
#

"We do not address the question of computational
efficiency, but we do note that there is evidence that using the inverse is
sometimes preferable from the performance perspective [3]."

#

🙂

#

just arguing against "always bad"

#

In general I agree 🙂

wide spear
#

Ok sure

#

If you follow [3] then they present 2 cases where A^-1 is faster than LU

#

Anyways

#

You should never solve the normal equations in the first case

hearty finch
#

Hi

#

I was wondering guys

#

How can I solve a system of equation with module operation, like for example

#

In fact, the algorithm does not need to find the solution, it needs to determine if there is a solution or not

#

I do not know if a regular Gaussian elimination would work and in case it does, I do not know how to perform the modification to it

fathom shell
#

"Interpolation is not often something that’s useful in machine learning, but rather something that we often use AI for. Feeding data to a computer and allowing it to make educated guesses for us, especially when it comes to millions of lines of data, is helpful in many different fields. "

Why is it not useful for machine learning? And i suppose by AI they mean neural networks, so why is interpolation good for NN?

prime kraken
#

with a deep enough network and enough data, the underlying structure can be inferred from the rough data

#

what's more, interpolating the input data incorrectly can make the network interpolate and extrapolate worse

#

interpolation and extrapolation don't generate any new information. they just exploit what is already there. the network already does this

fathom shell
prime kraken
#

because at best it does nothing, and at worst it produces incorrect data

#

i'm not quite sure what you mean by machine learning there

hearty finch
#

mod is supposed to be 2

#

I am doing a XOR operation system of equations

prime kraken
#

interpolation is often an optimization task that can be approached in several ways. deep learning is one of them.

hearty finch
#

Would you please tell me a little bit more?

fathom shell
prime kraken
#

linear regression does exactly what the name implies. it assumes the data follows a linear relationship. this is simply not true for most data, and so if you interpolate in such a simple fashion, you make mistakes

#

the problem is if you take this data and then give it to the network as if it were the correct data (which it isn't)

#

you then teach your network that it should do this linear interpolation, when it could do much better if you didn't confuse it

hearty finch
#

So, the linear operations should be redifined to work in Z/2Z ?

#

Sorry, I think I do not understand. It is being a little bit confusing to me

#

If R is not considered a finite field, I do think it is not, then I do not have experience with Gaussian elimination over finite fields

#

Now I get it

#

There is a way to know if a solution exists without performing the gaussian elimination?

#

The concept is amazing, it blew my mind

#

May I ask why did you have to work with this?

#

Me too hahahaha

#

"Given N friends, and given M relations of the form A -> B where A and B are friends, find if there is a partition of at most size 2 such that each member of the partition have an odd number of friends inside the partition"

#

Is it working in O(n^3)?

#

Would you mind sharing it with me?

#

No problem, thanks in advance

simple vigil
#

I suppose this isn't directly related but maybe someone will benefit... looking for something related to numerical methods/ hpc internships e.g. the NVIDIA SWE and MathWorks that entertain undergrad applications. Anyone know of anything good/ worth looking into

#

stuff similar to this

hearty finch
#

Thanks for the sharing

leaden breach
#

@brave crypt

wide spear
#

This is not the correct place for this

fathom shell
#

Can anyone help with interpreting Least squares(OLS) in matrix form, im so lost.

wide spear
#

What is your question

fathom shell
#

So i have this dataset right, where prmres is the dependent variable and the others are the independent. How would set this data up in matrices for least squares ?

wide spear
#

Ok so

#

Given $m$ data points ${y_i}$ measured at $m$ places ${x_i}$, the goal is to determine parameters ${\beta_i}$ so that $f(x,\beta)=\sum_{i=1}^n\beta_i\varphi_i(x)$ and ideally, $y_i=f(x_i,\beta)$ but we cannot have this because there are too many $y_i$ and $x_i$ for a linear relationship so we want to minimize the residual $r_i(\beta)=y_i-f(x_i,\beta)$

pine jettyBOT
#

Marius von Hagen

wide spear
#

To convert this into the matrix form, we take $X_{ij}=\varphi_j(x_i)$ and $y$ as the vector of $y_i$ so we have $\beta=(X^TX)^{-1}X^Ty$

pine jettyBOT
#

Marius von Hagen

wide spear
#

$\varphi_j$ is some function (possibly nonlinear) of $x$

pine jettyBOT
#

Marius von Hagen

fathom shell
#

can you give an example?

wide spear
#

In your case, X will be the tall skinny vector formed out of oms, konk, and nypr, and y will be the vector pmres

fathom shell
#

i get that but idk how to represent X in matrix form

wide spear
#

$X=\begin{bmatrix}5.355&2&0\5.551&2&0\5.62&1&0\\vdots&\vdots&\vdots\end{bmatrix}$

pine jettyBOT
#

Marius von Hagen

fathom shell
#

is X a coefficient matrix? @wide spear

wide spear
#

Coefficients of what?

fathom shell
#

ok so i have this python code to solve a least squares solution, it just uses np.linalg.lstq .

#

the arrays in the pic above corresponds to the this picture, where area is the independent variable

#

so what im asking if this is correct

wide spear
#

Sure

fathom shell
# wide spear Sure

are you knowledgeable on Stata? Do you know where the least squares solution is given in Stata?

wide spear
#

No

fathom shell
# wide spear No

alright thanks for the help man, i need to validate my python answers in some other program

wide spear
fathom shell
prime kraken
#

what else do you know about the model? that sounds like one of the variables is known to be scaled by 1

#

is this like a polynomial? are the quantities just scaled and added?

#

the model you have built there says that your data is of the form area = a bedrooms + b age + c price

#

and you're solving for a, b, and c

#

having an extra column of ones could be interpreted as either knowing something else about the variables, or maybe having an extra parameter to denote a base area that doesn't depend on the other variables

#

say area = a bedrooms + b age + c price + (1) d, and d is this base area

naive moat
#

Imagine I have the roots r1, r2, ... of the polynomials mentioned:

#

In particular, I have these roots, and I want to know if there's a quicker way of evaluating this (getting the rhs) than just straightforwardly

#

for k=1,2,3, ... say up to 7

prime kraken
#

you mean you want to compute both the LHS and the RHS? cuz the RHS is pretty simple

naive moat
#

I have these $r_{i_j}$

naive moat
prime kraken
#

right, so you want the LHS

naive moat
#

yes, and I'm hoping there may be a quicker way to evaluate it than just as given

#

based on these Vieta's formulas I would think there might be?

prime kraken
#

i don't think so

naive moat
#

it's likely gunna be the bottleneck to what I'm doing so I'm hoping for some way 😄

prime kraken
#

i would almost suggest you do the first few small polys with the sum and product definition

#

and then for large polynomials, actually find the coefficients and use the RHS

naive moat
naive moat
prime kraken
#

no, i mean build the poly

#

you know the roots, right=

naive moat
#

yea ok

prime kraken
#

so (x - x1)(x-x2)... is the poly

#

and the coefficients of the poly are what is used in the rhs

#

the computer does poly multiplication with convolutions, which should be rather efficient

naive moat
#

yep, that should be faster for the larger sum-prods I would think

#

ty!

prime kraken
#

you can compare it if youd like, but this way you save yourself using 2 loops and the cost is that of repeated convolutions

naive moat
#

so since it's on the computer, x is a number, it seems like I'd have to do it for like multiple x's and compute the coefficients via finite differences

prime kraken
#

what do you actually know about the polynomials

#

earlier you said you have the roots

#

if you have the roots, there's no need to do finite differences for anything here

naive moat
#

so I think I'm good

prime kraken
#

that's literally what i told you to do

#

lol

naive moat
#

yes

prime kraken
#

if you say so 😛 the name beta 0 alone means nothing to me out of context lol

fathom shell
# prime kraken if you say so 😛 the name beta 0 alone means nothing to me out of context lol

https://en.wikipedia.org/wiki/Design_matrix but yes my question was not the best, i shoulda asked how to format the linear regression model as a matrix and not just the data

In statistics and in particular in regression analysis, a design matrix, also known as model matrix or regressor matrix and often denoted by X, is a matrix of values of explanatory variables of a set of objects. Each row represents an individual object, with the successive columns corresponding to the variables and their specific values for that...

prime kraken
#

if you can provide the data model, we can set up the model matrix

#

it sounds like the assumption is that the data behaves like y = a x1 + b x2 + c x3 + x4

proud lion
#

Can someone help me with this?

prime kraken
#

you can do this in two steps. first, find the general solution in terms of all the free parameters (if any). if there are free parameters, you can then solve an optimization problem to make the sum of the coefficients zero (or some other constraint) if it's not evident how to do so at a glance

fathom shell
#

can you only plot regression with more than 1 feature by using a 3d plot?

opal mason
#

This is more of a theoretical question to something involved applied computation math, but I can't find a suitable channel so I'll drop it here>

#

My question is on subdifferentials
https://www.stat.cmu.edu/~ryantibs/convexopt-F16/scribes/subgrad-scribed.pdf

So one of the examples they give in this PDF about subgradients is the max of 2 convex, differentiable functions. They claim for x such that f_1(x) = f_2(x), the subgradient at f(x) is a convex combination of the gradients of f_1 and f_2. Geometrically this is easy to understand, but how do I go about proving this? The case when f_1(x) /= f_2(x) is pretty much trivial, but I'm not sure how to do the equals case.
The example in consideration:

#

preferably I'd like a hint for both directions (showing that it's only a subgradient if it's a cvx combination of the gradients), but a proof sketch/actual proof will do as well

prime kraken
#

first, if they are equal, they have the same subgradient

#

next, a sketch in R that you can go over to generalize to R^n 😛

#

we have the convex combination x1 a + x2 (1 - a)

#

let x2 = x1 + c, for some constant c

#

then the combination is rewritten as x1 a + (x1 + c) (1 - a) = x1 a + x1 - a x1 + c - ca = x1 + c(1-a)

#

we know 0 <= a <= 1 due to the definition of a convex comb

#

this means the result ranges from x1 to x2, i.e. either [x1,x2] or [x2,x1]

#

this is pretty much the same definition of a subgradient itself, since that is a range of slopes, so to speak, for which the subtangents lie below the graph

#

i.e. if the subgradient is [y1,y2] and x1 and x2 are in this set, then [x1,x2] is a subset of [y1,y2]

#

and graphically it's the region between two lines

#

idk if that helps you, i just woke up lol

opal mason
#

So for the 2D case, $\xi = \lambda\nabla f_1({\bf x}) + (1-\lambda)\nabla f_2({\bf x})$
Then if I suppose $\nabla f_2({\bf x}) = \nabla f_1({\bf x}) + {\bf c}$, so $c$ is some vector here
and then aaaaaa
I get $\nabla f_1({\bf x}) + {\bf c}(1-\lambda)$

prime kraken
#

mhm

pine jettyBOT
prime kraken
#

parentheses

opal mason
#

wait so is this the sketch for showing a convex combination of the gradients is a subgradient?

#

or the other way around

prime kraken
#

as you said

#

that taking a cnv comb of subgradients is a subgradient

opal mason
#

you mean cnv comb of gradients?

#

It's really late here and my thought is very slow miyano_cursed

#

I'm still failing to see where I get the subgradient from the last expression

prime kraken
#

oh i guess ur right, yeah

opal mason
#

how would I show that every subgradient at that point is a convex combination of gradients then? So starting with some subgradient, showing that it's a convex comb of grads

#

I feel like I'm missing something pandaOhNo

prime kraken
#

hmm

#

probably backwards

#

start with the subgrad at x being an interval

#

and show that a subset can be expressed as a cnvx comb

#

i don't see a good way of showing the endpoints of the interval are the grads of f1 and f2 right now, tho

vale imp
#

Anyone know why centers for rbf neural networks are chosen using k means clustering? I'm going to use an rbf neural network to approximate some non linear function and I was thinking of using gradient descent to minimize the error and treat the centers as extra parameters just like the weights. Is there any reason why that approach would be inefficient compared to k means clustering?

vale imp
#

Would it lead to overfitting because the algorithm might now assign the centers numerical values that best fit the data, but may not be the actual point about which the data is centered?

grim bobcat
#

Kind of a simple problem but I have no idea with computational stuff. I have a simplical cone (Given 3 linearly independent vectors $v_1,v_2,v_3$ in $\mathbb R^3$, the simplical cone spanned by them is the set of all linear combinations with positive coefficients), and I wanna find out in an efficient manner if a point lies inside (Including on the boundary). The points I wanna know if are inside are specifically integer points if that helps. I've thought that, this cone is actually the intersection of the 3 half spaces given by the planes spanned by every 2 vectors from $v_i$, but i'm not sure how to find which part of the half-space I should even take (And therefore which orientation of the cross product I should take). I also thought there might be an easier way to find this out using recession directions but I haven't really fleshed that thought out

pine jettyBOT
wide spear
#

I'm heading out now but I'll answer if it is unanswered when I get back

grim bobcat
#

Thanks!

prime kraken
#

the straight up naive way is to set the vectors as columns of a matrix, invert the mat, and multiply by the coordinate vector

#

you'll get the coords in the basis of the v_i and then just check if they're all positive

#

this ignores the points being integers tho

grim bobcat
#

Oh nice, I didn't consider that since I was too focused on the half-space idea 😅

#

There might be a better way but at least this works haha

#

Thanks

leaden breach
#

Hi could someone help and point me in the right direction to solve this problem:

#

I want to write a python script that does the following:

#
  1. Isolate a square root term on the LHS.
  2. Square both sides of the equation.
  3. Add and subtract terms accordingly to get 0 on LHS.
  4. Isolate another square root term on the LHS. [ i.e. step 1) ]
  5. Repeat step 2).
  6. Repeat step 3).
  7. Repeat step 1).
    ETC . . .
#

Would this be "solving equations iteratively in Python" ? (I have very little experience with code.)
(And I know I can just tell WolframAlpha to solve this for me, but it would be cool to this piece of a problem I am working on all by myself, in a way.)

wide spear
#

Another solution: truncate the cone at a point past your point of interest, then determine if the point of interest is inside this closed convex polyhedron

#

This is a classical problem that you will find many algorithms for

gaunt crag
#

is the cengage edition of sipsers introduction to the theory of computation fine

#

like does it have errors in it?

#

i bought the book for $20 new when the main published edition is for like $200

random hornet
#

Might be better asked in #book-recommendations ; and I think Cengage is the authorised publisher for both hardcover and paperbacks.

opal mason
#

More theoretical, but any ideas on how to show this is convex? (or counterexample?)

#

if i'm picturing it correctly geometrically, is this just a shifted quadratic, with lower bound of 1, which should be convex?
I've tried just a bunch of algebraic manipulation and I'm just not getting anywhere

grim bobcat
#

Has someone here worked with polyhedra in sage? there's something dumb that is driving me crazy and I can't find any documentation on

wide spear
grim bobcat
#

so when you ask for a vertex representation of a polytope it gives you a list that's like 'A vertex at (x,y,z), A vertex at (x_2,y_2,z_2)' etc.
All I want is that, but without the words, and in vector form. Just a list of the vectors making up the vertices

#

is that too much to as kfor

wide spear
grim bobcat
#

I guess

#

i'll reposti I just dk if that channel gets any traction

hazy basin
#

I have a homology theory with coefficients in GF(2) that I ultimately need to find representatives for generating sets of the homology groups. I'm guessing that this is a fairly standard problem (compute the homology groups) that there might be a python package for? A brief search shows that I might be able to find some routine for computing the Smith normal form instead?

#

*The boundary maps are sparse finite dimensional matrices

#

Is computing the Smith normal form the standard route? / Is there a package that can do that for a very sparse boundary maps with dimension ~10k x 10k ?

wide spear
#

I would ping max but max is no longer with us

marble kestrel
prime kraken
#

i think given that A is pos def, x is any vector in R^n

#

so it's convex

#

hmm

#

nvm they gave the cond that the result is >= 1

marble kestrel
#

the derivative will be $$A(x - e_1) + (x - e_1)^TA = A(x-e_1)+ (A(x-e_1))^T$$

the second derivative will be $A+ A^T= 2A$ which is the hessian, and is positive semi-definite so you’re function is convex.

pine jettyBOT
#

c squared

prime kraken
#

but you can first look for the condition for a vector v, and then let v = x-e_1

#

you sure that works? they're asking for the convexity of the set, not the func

marble kestrel
#

oh whoops. misread

maiden badger
#
2x - y = 0
x^3 + x - y = 0

am exact ans is (1, 2)

if my guess is (2, 3) and after my first iteration calculations, my values are (1.5, 3.5)
there's must be a mistake in my calculation

barren walrus
#

Hi friends I have no idea where to ask this question but maybe you guys have an answer: Is it possible to calculate partial derivatives of an pde even though it has no analytical solution? Like for example the 3dim Navier Stokes Equations?

opal mason
barren walrus
marble kestrel
#

yea i was working through it. i got it to almost work out but there were some terms that just didn’t want to disappear

wide spear
#

You'd probably want to use a sympletic solver right

#

Equations coming from physics usually conserve energy

edgy sage
#

Given an array $a$ of $n \leq 1000$ integers with $1 \leq a_i \leq 250$, compute the number of distinct subarrays of $a$.

#

I have a solution using rolling hash that seems to work, but what is the deterministic solution?

#

I feel it should use string algorithms

pine jettyBOT
#

IlIIllIIIlllIIIIllll

brave crypt
#

hey all

#

i have a problem i wanna write an algorithm for

#

but im v confused on how to start

wide spear
gaunt crag
#

what would happen if you defined a new type of automataon saying that each state needs to have >=1 outward arrows for each letter in the alphabet

#

would any new special properties arise

loud bough
#

https://math.stackexchange.com/questions/1470990/x-n1-x-n-fx-n-over-fx-0-find-c-and-s-such-that-e-n1-ce-n#

For this question on a modified Newton's method problem, I can't comment on stackexchange, but in the last line of Gordon's answer, isn't f(r) = 0, so he is dividing by zero? My own answer came out quite a bit differently, I got the same s value but C = f''(r)e0/[2f'(r)]

vale imp
#

Suppose I wanted to make a linear fit to a dataset with vector input and output, by minimizing the least square error. Then the least square error equation would be
$$E = \frac{1}{2}\sum_i(W\vec{x}^{(i)} - \vec{y}^{(i)})^2$$
And I have to find the weight matrix $W$ that would minimize this. Can't I treat this as a colelction of linear regression problems? Like the least squared error for the j'th component of the output would be
$$\frac{1}{2} \sum_i ( \vec{w}_j \cdot \vec{x}^{(i)} - y^{(i)}_j )^2$$
And each component equation is independent from the next. So can I treat these as simple linear regression problems and find the rows of the weight matrix from the one step formula of linear regression that used the pseudoinverse (I forgot the exact formula) ?

pine jettyBOT
vale imp
#

I think I can but I see other people use methods like gradient descent or particle swarm optimization to get to the minimum instead for this sort of problems with vector outputs so I'm confused.

brave crypt
#

Hi Guys, I have a scenario, I want to turn it into a mathematical model,

The scenario:
I have a car that makes a reservation for the network resources in advance (use it in the future), That is, when the car reaches the coverage of the Base station (as you can see in the attached Foto), for example, the car has been reserved resources for a while, however, this reserved time may change (update) before the car enters the coverage of Base station due to uncertainty in traffic and movement. Updating the reservation takes the following steps:

  1. cancel partial (overbook): If the reserved time is more than the time car needs it right now.

  2. Cancel all and new reservation (underbook): If the reserved time is less than the time car needs it right now.

Cancel Partial:
In the cancel partial there are different strategies, for example:
a) If 70% of the reservation is overbooked, should we cancel it all and reserve a new reservation, or cancel it partially why?
b) If 50% of the reservation is overbooked, should we cancel it all and reserve a new reservation, or cancel it partially why?
c) If 20% of the reservation is overbooked, should we cancel it all and reserve a new reservation, or cancel it partially why?

There are cancelation strategies and fees in our model:
Peak-On (i.e. Traffic flow is very high in this Base station coverage):

  1. Cancellation of reservation before entering the car the base Station coverage. The cancellation fee is 5%.
  2. Cancellation of the reservation after entering the car the base Station coverage. The cancellation fee is 20%.

Peak-Off (i.e. Traffic flow is very low in this Base station coverage):

  1. Cancellation of reservation before entering the car the base Station coverage. The cancellation fee is 0%.
  2. Cancellation of the reservation after entering the car the base Station coverage. The cancellation fee is 5%.
wide spear
#

I'm going to be honest, this question doesn't really fit here

hidden marsh
#

I have a question,

#

In class I've proved the following inequality

#

Where Ax=b and Ax^=b^

#

Now I want to come up with a proof that there exists a b thats not b^ s.t. the equality holds

#

but I'm not sure where to start

#

I can see that I can easily get the equality if I assume |A^-1| |b-b^|=|A^-1(b-b^)| and |A| |x|=|Ax|

#

but I'm sure theres a stronger assumption I can make about b

#

any help is appretiated

orchid sequoia
#

You can consider vectors that maximize the operator norms, but I don't remember the outline of the proof

limpid wagon
#

hi

#

can someone suggest some projects to be done with newton raphson method

#

a unique one infact

#

where you explain the project objective

#

methodology

#

problem statement

#

and algorithm

brave crypt
#

https://financialdata.tiiny.site/ I pasted the data that I got using python. My problem is that the PCA gives me 49 eigenvalues and I have 93 variables. Isn't that strange ? I should have as much eigenvalues as variables no ?

prime kraken
#

you have the data arranged as a matrix, yeah?

wide spear
#

Data only has 49 rows

#

You'd have 93 variables if data had 93 rows

wide spear
prime kraken
#

unless...

#

jk jk. but...

brave crypt
#

Isn’t it supposed to be related to columns ?

fathom rain
prime kraken
#

well, PCA deals with the eigenvalues of the gram matrix. you could build that either as A^TA or AA^T

brave crypt
#

A^TA

prime kraken
#

regardless, since the matrix is 49 x 93, apparently, then it can at most have rank 49

brave crypt
#

Matrix here is 93 93

prime kraken
#

yep

#

and 93-49 eigenvalues are 0

brave crypt
#

Why 94 -49 ?

prime kraken
#

as a lazy proof, you can use the SVD to prove this

#

if the original matrix is 49 x 93, let's call it A

#

then it has an SVD such that A = USV^H, where U is an orthonormal mat of size 49 x 49, S has the singular values along its main diagonal and is of size 49 x 93, and V is orthonormal and of size 93 x 93

#

then A^H A = V S^H U^H U S V^H, and sice U is orthonormal, U^H U = I

#

then S^H and S are diagonal and you get a diagonal matrix D = S^H S of size 93 x 93 with at most 49 nonzero entries

#

and A^H A = V D V^H, all matrices square of size 93 x 93, V orthonormal, and D a diagonal matrix, so this is the eigenvalue decomposition of A^H A

#

and it has the same rank as A, and the eigenvalues are the absolute value squared of the singular of A

#

the rank of A is <= min(number of rows, number of columns)

#

so the rank of A^H A follows the same property

brave crypt
#

Whats H ?

prime kraken
#

hermitian transpose

#

complex conjugate transpose

#

not needed here if the data is all real, so just replace it with a regular transpose

prime kraken
#

because of how diagonal matrices multiply

#

try this

#

take $\begin{bmatrix} a && 0 && 0 && 0 \\ 0 && b && 0 && 0 \end{bmatrix}$

pine jettyBOT
prime kraken
#

call that S

#

and calculate S^T S to convince yourself

#

this is easy to generalize to arbitrary sized diagonal matrices

limpid wagon
#

hii

#

does someone know about NR method

#

newton raphson

#

and thomas algorithm

#

@prime kraken ?

#

@wide spear

wide spear
#

If you have a question, ask it

limpid wagon
#

okay i wanted to know the applications of it

#

newton raphson

#

and thomas algorithm

wide spear
#

We aren't here to do your project for you

limpid wagon
#

ik'

#

i wanted to know what kind of projects acan be done

#

obv i have to do

prime kraken
#

anything where one has to optimize something

limpid wagon
#

okay

#

do you have suggestions?

wide spear
#

We aren't here to design your project either

brave crypt
#

they are diagonal yes but i don't the link with non zero

prime kraken
#

dude write it on a piece of paper and do it

#

and here's a related proof, too

brave crypt
#

oh i get it

prime kraken
brave crypt
#

it's because S is 49,93

#

so maximum u have 49 non zero entry right

#

for D

#

"S has the singular values along its main diagonal and is of size 49 x 93"

wide spear
#

Someone could use a linear algebra refresher

brave crypt
#

that was my problem

#

i though that S matrix was diagonal

prime kraken
#

i literally showed it above

#

do you know what an SVD is?

brave crypt
#

that is has the same rank as a ?

brave crypt
prime kraken
#

yes

#

do you know what a symmetric matrix is

brave crypt
#

y

#

A^TA is symmetric

prime kraken
#

here's a quick proof

brave crypt
#

i get his proof

#

ut now i'm stuck at the end

#

how did he get dim = dim and rand = rand from the first equality xd

wide spear
sand knoll
#

The methods in question are PCA, EM algorithm, Neural networks and Naive Bayes

#

Obiviously for PCA there is a positive effect as we want to find underlying correlations. For the other techniques, I am less familiar with them.

sand knoll
#

Perhaps for neural networks we can consider correlation as a feature to work with, so it would not really matter.

hidden marsh
#

I have a question about matrix decomp
so I know how to decompose a matrix A = L*U
where L is a lower triangular matrix and U is an upper triangular matrix
Now let A = [4,-1,-1;-1,4,-1;-1,-1,4]
I was able to decompose it into LU but one of the question asks me to decompose it to Ltrans(L)
But I don't know any method of doing that
Any help would be appretiated

wide spear
#

See Cholesky decomposition

fathom shell
#

can anyone help with gaussian quadrature

orchid sequoia
fathom shell
#

i need to estimate this integral using the gaussian-legendre quadrature formula

#

And the question is "Would you consider the result of Gaussian quadrature in this case a good approximation? If no, why do you think, this integral is difficult to approximate? "

#

but idk what to make of it

fathom shell
#

so the polynomial : (1)/(x**2+0.0001) and im integrating with range [0,1] and degree 2 and i get the result : 23.82841

So that begs the question which i dont know the answer to: Why is this a bad estimation?

fathom shell
#

to get a result of 156 i would need the degree to be n = 100, that wouldnt make sense tho as the degree is the degree of the polynomial

sage vapor
#

where are you seeing a polynomial

slow shale
#

Hi, I was sort of confused about this question and wanted to see what others had to say

#

My thinking is that as the model parameter increases, the training error decreases but the test error will increase too due to overfitting

#

For this reason I was thinking that the first one is probably not correct but, the second one sort of makes sense to me as well

hard escarp
#

Hi. Just curious. Would this be the right channel for numerical analysis stuff?

wide spear
#

Yes

near trench
#

Could questions on automata theory work here? I don't quite understand how NFAs are made.

wide spear
#

Best to ask in the CS server

untold crypt
#

I have a question to do with LaGrange interpolations. better to ask here or #advanced-analysis ?

wide spear
#

Fine here

untold crypt
#

okay i think i figured it out but i was just having trouble reasoning it out this derivative. im still not sure if i did it correctly. Here is part of my work

#

im trying to find omega prime given omega.

#

this is the prompt:

wide spear
#

Ok that looks fine

untold crypt
#

okay i was mostly concerned with the making sure everything was accounted for properly in the product rule expansion.

wide spear
cosmic karma
#

how do I build this block tridiagonal matrix in matlab? I was able to build T but not A

#

Update: I was able to build a block diagonal matrix will all matrix T's. I do not know how to include the identity matrix yet

brave crypt
#

Hi quick question: What norm should be used for the banach fix point theorem? I have a fix point iteration of the form $x^{(k+1)} = \Phi(x^{[k)}) = Ax^{(k)} + b$ where A is a matrix and b a vector, now i wanted to do:
$||\Phi(x) - \Phi(y) || = ||A(x-y) || \leq ||A|| \cdot ||x-y||$, i would like $||A||$ to be my contraction number but for some norms i get $||A|| \geq 1$

pine jettyBOT
#

Gewisser Fler

wide spear
#

Whatever norm works

#

You'll get convergence in whatever norm you choose

#

And on fin dim vector spaces, all norms are equivalent

fathom rain
brave crypt
#

I'm doing a PCA

#

I'm a little bit lost to know if I should remove outlier ?

#

and how to know if they are outlier

prime kraken
#

robust PCA is a big deal, since covariance matrices are susceptible to them

#

but sophisticated techniques are beyond your reach right now

#

you could try something simple

#

vectors whose norm differs too much from that of the others, for example, could be discarded

#

or entrywise computing the quartiles, too

brave crypt
#

@prime kraken i computed quartiles to find outlier

#

but idk if it's usefull to delete them

prime kraken
#

it is

brave crypt
prime kraken
#

no

brave crypt
#

ok x)

#

@prime kraken

#

for example for the column the mean is 36 and the max is 156

#

if he is considered as an outlier using quartiles, should i remove it ?

#

it's not a value that is crazy but it's also considered as an outlier by the strategy so idk

prime kraken
#

it also depends on what the data represents, so i can't comment

brave crypt
#

oh ok

main urchin
#

Given the error amount (higher_order-lower_order)*step_size, what is the proper way to accept or reject a step in adaptive Runge Kutta? And what is the formula to adjust the step_size for the next step? Everybody on the internet says something different...
I have the iterator like this:

const auto result     = method_type         ::apply   (problem_, step_size_);
const auto evaluation = error_evaluator_type::evaluate(problem_, step_size_, result);

if (evaluation.accept)
{
  problem_.value = result.value;
  problem_.time += step_size_;
  step_size_     = evaluation.next_step_size;
}
else
{
  step_size_     = evaluation.next_step_size;
  return operator++(); // Retry with the adapted step size.
}

now trying to implement error_evaluator_type::evaluate which returns a boolean saying whether to accept or reject the current step and the next step size, but stuck AF:

struct error_evaluator
{
  constexpr static auto absolute_tolerance = 1e-6;
  constexpr static auto relative_tolerance = 1e-3;

  template <typename problem_type, typename time_type, typename result_type>
  static error_evaluation<time_type> evaluate(
    const problem_type& problem  , 
    const time_type     step_size, 
    const result_type&  result   )
  {
    time_type error = NORM(result.error / (absolute_tolerance + relative_tolerance * MAX(problem.value, result.value)));
    return {error < time_type(1), step_size / error};
  }
};
wide spear
#

Accept in error < threshold (decided by you)

#

Formula for step size: double if error is very small, half if error is large

#

Or whatever amounts you want

sand knoll
#

I want to make a comparison of the convergence rate of different matrices for conjugate gradient vs gaussian elimination. I have the expression $\frac{|| x_k- x^* ||}{||x^*||} \sim (1- \frac{1}{\sqrt{K_A})})^k$ for the convergence of the CG method. Im wondering if there is a similar expression to be obtained for the convergence rate of gaussian elimination involving condition numbers s.t I may compare use the condition numbers to evaluate which is the quickest method?

#

I believe $\frac{|| x_k- x^* ||}{||x^*||} \sim (1- \frac{1}{K_A)})^k$ holds for steepest descent, but does it work for A = LU?

#

here $x^$ represents the true solution of $Ax^ = b$ and $x_k$ is the solution of conjugate gradient at the k-th iteration.

pine jettyBOT
#

Fredrikpiano

#

Fredrikpiano

#

Fredrikpiano

wide spear
#

The run time of Gaussian elimination doesn't depend on the condition number because it isn't an iterative method

sand knoll
#

well I believe the gaussian elimination in this context is LU decomposition

wide spear
#

You don't have a convergence rate either

#

LU decomposition isn't iterative

sand knoll
#

ahh, yeah that makes sense. But then the best comparison would be to state the time complexity for the different methods I guess

wide spear
#

Yes

sand knoll
#

right thanks

hollow pagoda
#

not sure if this is the correct place to ask but

#

how do I turn a max min objective function into a linear program?

sand knoll
#

@wide spear sorry for the ping. I kept finding different answers for the time complexity for the conjugate gradient. There is a paper titled "An Introduction to the Conjugate Gradient Method without the agonizing pain" which lists the time complexity as $O(m \sqrt{K_A})$ where m is the number of nonzero entries in the mtx. Then if the number of unknowns is equal to $1, N=1$, there would be nothing to gain, but as N increases ($N \geq 2$) the CG algorithm will always converge faster. Does this sound reasonable?

pine jettyBOT
#

Fredrikpiano

wide spear
#

Yes, that is Shewchuk's document right?

#

What is N

sand knoll
#

yes, Shewchuck. N is just to denote the dimension i .e N = 10 generates a 10x10 matrix in the matlab program I am running to generate some matrices which I want to compare the convergence rate for. I choose the infinity norm for when I am computing the condition numbers (||A^-1|| ||A||_\infty), but I could have chosen other norm as well right?

wide spear
#

Usually the condition number refers to the ratio of the largest and smallest eigenvalue

#

Clearly there is nothing to be gained when N=1

#

But you haven't said what you're comparing CG to

sand knoll
#

Comparing it to LU decomposition with time complexity O(n^3)

wide spear
#

Ok

sand knoll
#

one can relate condition numbers to a mtx being singular as well right? If the condition number is very large for example

wide spear
#

Yes

sand knoll
#

I also have the following exercise Im working on from my teachers lecture notes

#

Im thinking I should show that A has one lin. independent column and work from there

wide spear
sand knoll
#

yep

#

thanks

hollow pagoda
wide spear
#

What are you minimizing with respect to

#

What are you maximizing with respect to

hollow pagoda
#

in this case i am minimizing the i's

#

and maximizing the sum ai1 s1 ... + ain xn

wide spear
#

What do you mean by minimizing the is

#

The ai's?

hollow pagoda
#

yes, i think we are choosing an i

#

such that a_i will be minimized

#

and then we "maximize" by choosing a x

#

I think the first step would be to replace the entire

#

with "A"

#

and then try to translate this minimiztion equation into a set of constraints

#

but im not sure how

wide spear
#

Wait are you minimizing the sum of the a_i

#

What does minimize the a_i mean

#

You can rewrite that sum as Ax, yes

hollow pagoda
#

minimizing ai1 + ai2 ... + ain

#

and after that, maximizing ai1x1..ainxn

wide spear
#

Ok so first you minimize the 1-norm of A

#

And then maximize the 1-norm of Ax

hollow pagoda
#

i believe so

wide spear
#

Why do you want this as a linear program

#

What constraints do you have on A aside from Ax<=1

hollow pagoda
#

this is the only constraint

#

and this is an exercise, the rest of the numbers were fairly straightforward

wide spear
#

I see

#

Wait

#

We don't minimize abs(ai1)+abs(ai2)+... right

#

We minimize ai1+ai2+...?

#

Then there is no minimizer A

#

You can make the sum as small as you want

hollow pagoda
#

wait i might've been wrong

wide spear
#

Ok that makes more sense I think

hollow pagoda
#

so first i minimize the sum, and i pick the biggest "component"

wide spear
#

Yeah

#

I'll assume that A is fixed then

#

Then you have m linear programs to minimize a_i^Tx

#

Then you have one linear program to maximize c^Tx where x is a vector with entries summing to 1 and c is the vector of minimizers of the a_i^Tx

#

And you use the constraint when minimizing a_i^Tx

#

I think

hollow pagoda
#

ok so you're suggesting i use two linear programs?

wide spear
#

Yes

hollow pagoda
#

hmmm ok i'll try that

#

thanks!

tall solar
#

Can someone remind me about how random matrix theory is related to compressed sensing?

For example in the matrix completion problem you need to assume the entries are missing randomly. So you are basically dealing with Bernoulli random variables. Then you have a matrix of Bernoulli variables. Somehow properties about random matrices of 0 and 1s is relevant somehow. So I think random matrix theory comes in here. But I wanna know the core results that are relevant

prime kraken
#

you're interested in matrices that satisfy the restricted isometry property, then

wide spear
#

Is this what you want

prime kraken
#

looks good

tall solar
#

Thank you so much I think the tools from probability chapter is exactly what I was looking for

tropic loom
#

Would this channel fit for Automata theory?

wide spear
#

No, you should ask about automata theory in the CS server, linnked in #old-network

tropic loom
#

👍

terse flower
#

hi friends possible to compute cholesky factorization on non square matrix? 👀

#

^matlab

wide spear
terse flower
#

thank you ❀

marble kestrel
#

anybody here know machine learning?

wide spear
#

You might have better luck asking in the ML server linked in #old-network but you can always try asking here

marble kestrel
#

alr, thanks

marble kestrel
#

just going to ask here in case:
alr, so i have some data:

X1    X2  Label
0.4  0.4  Red
0.8  0.8  Red
0.2  0.4  Blue
0.4  0.2  Blue
0.8  0.2  Blue
0.2  0.8  Blue

and i have the following kernel K(x,y) = y^T x / ||y|| ||x|| and i am tasked with mapping each data point in the original feature space to the new feature space representation implicit by K(x, z).

im kind of unsure what to do...

#

just to make sure im understanding, the original feature space is the set of ordered triplets that look like
(x1, x2, label) correct?

viral tendon
#

Anyone can help me with calculating norm of inner products?

#

I've got u,v in V
V is a space of inner products

#

| | u | | =3 ; | |u+v| |=4 | |u-v| |=6

#

how do I find v?

#

or | |v| |

random hornet
prime kraken
#

for example, $\langle u, u \rangle = \Vert u \Vert^2 = 9$

pine jettyBOT
wide spear
prime kraken
#

indeed 😛

viral tendon
#

does it give the option to break it?

wide spear
prime kraken
#

check what i wrote over in linalg

#

i explained it over there

finite salmon
#

i'm using RStudio and i'm trying to analyse a dataset. one of the variables in it is "product_brand" where each entry is a brand name e.g. "APPLE". i want to make a histogram of a different variable in the dataset but i only it to consider data from a specific brand, does anyone know what i would write?

wide spear
finite salmon
#

ah sorry

hazy basin
woven shale
#

Does heun's method give the same solutions as Euler's method? (Specifically the forward euler)

#

I'm a little confused on this, because Heun's is considered a 2nd order method, so I don't see how it would give the same answer as Euler's. But if it's just an improved Euler's method, shouldn't it give the same answer?

wide spear
#

No it doesn't give the same answer as Euler's method

tribal pulsar
#

How does the green part works?

#

I search around the internet and find "finite difference approximation" but how?

prime kraken
#

you can read about it on almost any google search about finite difference approx :p

#

In numerical analysis, finite-difference methods (FDM) are a class of numerical techniques for solving differential equations by approximating derivatives with finite differences. Both the spatial domain and time interval (if applicable) are discretized, or broken into a finite number of steps, and the value of the solution at these discrete poi...

fathom rain
# tribal pulsar I search around the internet and find "finite difference approximation" but how?

Intuitively think maybe like this: f'(x)=df/dx Here, df is a small difference in f and dx is a small difference in x. So, df=f(x1)-f(x2) where x1 and x2 are close, and then dx=x1-x2. f'(x)=df/dx is only true as the difference goes to zero and for finite differences you only have that f'(x) is approximated by df/dx

In your case you are interested in f'(x) for x=p_{n-1} and you have that p_{n-2} is a point close to p_{n-1}. Hence, just plug in your values to approximate the derivative f'(p_{n-1}), that is df=f(p_{n-1})-f(p_{n-2}) and dx=p_{n-1}-p_{n-2}

tribal pulsar
sleek sundial
#

A question for seasoned data scientists. Let’s say we have a dataset and we suspect it has a lower-dimensional underlying structure. Obviously, we’d like to work with fewer dimensions. But all dimensionality reduction algorithms I know of presuppose the structure of the manifold: PCA thinks of the data as an ellipsoid; Isomap only works if you happened to correctly guess the topology you embed the data into, considering the usual guess is a flat plane, it isn’t very useful most of the time; t-SNE thinks of the data as a collection of globular clusters, and tries to fit anything into that framework; and so on. I’m curious, is there an algorithm that can learn the underlying manifold directly? As an atlas? Or, even better, the algorithm to extract not only the topology of the data space, but also its curvature in every point?

wide spear
#

UMAP?

sleek sundial
#

Isn’t it just a slightly better version of t-SNE?

wide spear
#

Based on the paper, it seems like they aren't assuming anything about the underlying manifold

civic ledge
#

anyone knows about the redlich-kwong equation of state
i wanted to know if that eqn could be solved using numerical methods?

fathom rain
civic ledge
#

ok can this be used for doing mini projects?

#

this equation

#

for finding something

fathom rain
civic ledge
#

oohk

#

could u just go through it

#

and lmk if possible

fathom rain
toxic wagon
#

hello! i have asked a question in #help-14 regarding finite difference approximation. im not sure how to tackle this problem.

wide spear
#

Can you repost your question here and close the question channel

toxic wagon
#

Hello! I currently have a problem with using finite difference approximation for solving the eigenvalue problem of a periodic boundary condition. Right now I only have the second-order forward and backward approximations for f'(x) and central approximation for f''(x). From there, I don't know where to proceed. As far as I know I should set up a banded symmetric matrix A?

The eigenvalue problem is as follows:
y''(x) = -lambda y
y(0) = y(1)
y'(0) = y'(1)

#

here it is @wide spear

#

from what I know about FDA, i'm supposed to have some equation that looks like
$$
\begin{bmatrix} 2 & 1 & 0 & 0 \ 1 & 2 & 1 & 0 \ 0 & 1 & 2 & 1 \ 0 & 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} y_1 \ y_2 \ y_3 \ y_4 \end{bmatrix} = -\lambda
\begin{bmatrix} y_1 \ y_2 \ y_3 \ y_4 \end{bmatrix}
$$

wide spear
#

Do you want to fix that matrix

toxic wagon
#

$$
\begin{bmatrix} 2 & 1 & 0 & 0 \ 1 & 2 & 1 & 0 \ 0 & 1 & 2 & 1 \ 0 & 0 & 1 & 2 \end{bmatrix}
\begin{bmatrix} y_1 \ y_2 \ y_3 \ y_4 \end{bmatrix} = -\lambda
\begin{bmatrix} y_1 \ y_2 \ y_3 \ y_4 \end{bmatrix}
$$

pine jettyBOT
#

gristCollector

toxic wagon
#

there we go lol

#

thats what i believe the matrices look like for dirichlet boundaries

#

but im not sure how this is going to look like in periodic boundaries

wide spear
#

With periodic boundary conditions you'll get a 1 in the top right and bottom left corners of the matrix

#

The [1 2 1] wraps around, in a sense

#

Also don't some of those need to be negative

#

Shouldn't it be 1 -2 1

toxic wagon
#

oh right, because $y(0) = y(1)$, which means $y_1 = y_4$

pine jettyBOT
#

gristCollector

toxic wagon
#

oh right, should it?

#

ah yeah it is whoops

#

it is supposed to be -2

#

but is that the only change to this matrix?

wide spear
#

Yes, that's the only change

toxic wagon
#

ohhhh

#

ok that makes a lot of sense actually

#

and now i just use the same methods as i use in dirichlet to calculate for the eigenvalues?

wide spear
#

Sure

toxic wagon
#

i see, thank you very much!

fathom rain
toxic wagon
fathom rain
# toxic wagon may i ask why this is the case? our lectures just passed over this tbh and i did...

Because circulant matrices form a matrix algebra that can be diagonalized by, for example, the Fourier matrix, and in that case the eigenvalues are given by sampling the symbol (function f here, which generates the matrices (the elements of the matrix are given by the Fourier coefficients of f)), by for example that grid t_j. So, all eigenvalues of all circulant matrices are given in the same way.

toxic wagon
#

i see

#

im still trying to understand the last bit about fourier coefficients but i think i get the gist of it

fathom rain
#

If you have Dirichle BC instead the grid is just t_j=j*pi/(n+1) instead (same symbol f)

#

f(t)=-2+2cos(t)=exp(-i*t)-2+exp(i*t) then Fourielr coefficients are f_{-1}=1 f_0=-2 and f_1=1 and that is the elements on the upper diagonal, main diagonal, and lower diagonal of the matrix (and in the circulant case for periodic BC you just add 1 oin top right, bottom left corners)

#

Dirichlet -> $T_n(f)$ and Periodic -> $C_n(f)$ and Neumann -> $T_n(f)+R_n$ where $T_n$ is a generated Toeplitz matrix $C_n$ is a circulant and $R_n$ is a low-rank matrix (only 1 in top left and bottom right corners)

pine jettyBOT
#

Sven-Erik

fathom rain
#

(they all share the same symbol, eigenvalues are just given by different grids)

wide spear
#

-Âż

toxic wagon
#

ok i think i understand it

fathom rain
#

In engineering the matrices that diagonalizes these matrices (when combining different BC) are usually denoted dct-* and dst-* (discrete cosine and discrete sine transformes where *=1,2,3...)

tall solar
#

Anyone here know about proximal operators

brave crypt
#

any help with how to get started on this stochiastic calc problem

wide spear
brave crypt
#

i'll post it there

obsidian oracle
#

Hello everybody.
anyone has the demonstration or a paper that discusses the rate of convergence of stochastic gradient descent when the gradient is L-Lipschitz continues? I can't find it in any paper

obsidian oracle
prime kraken
#

there are several papers on related topics though, just look it up on google scholar

obsidian oracle
prime kraken
#

👍

neon snow
#

Hey! I want to model a spiral transfer with a low thrust engine from an orbit of 2000 km to 40 000 km. Any advice on how to model this? I'm thinking express the equation of motions in cylindrical coordinates:
[
\frac{d^{2}r}{dt^2}-r\left(\frac{d\theta}{dt}\right)^2 = -G\frac{M}{r^2}
]
for the radial part, and for the tangential part:
[
r\frac{d^{2}\theta}{dt^2}+2\frac{dr}{dt}\frac{d\theta}{dt}=\frac{F}{m}
]
where (F) is the force of the engine then. I suppose I could just numerically solve this in MATLAB? How can I rewrite this system in a form that I can feed to a numerical method, like \texttt{ode45} in MATLAB?

pine jettyBOT
#

Victor H

wide spear
#

To feed this into a pre-built solver, rewrite this as a system of 1st order equations

neon snow
#

Yeah, I'm struggling a bit with how haha, I totally forgot lol sorry.

#

How would I start off? If I remember correctly you usually express the second derivative by introducing an extra variable, and use that as an intermediate variable... I think

wide spear
#

Write $x=\frac{dr}{dt}$ and $y=\frac{d\theta}{dt}$ so your differential equations become [\frac{dx}{dt}-ry^2=-G\frac{m}{r^2}] and [r\frac{dy}{dt}+2xy=\frac{F}{m}] and [y=\frac{d\theta}{dt}] and [x=\frac{dr}{dt}] so you have [\frac{d}{dt}\begin{bmatrix}\theta\r\x\y\end{bmatrix}=\begin{bmatrix}y\x\ry^2-\frac{Gm}{r^2}\\frac1r\qty(\frac{F}{m}-2xy)\end{bmatrix}]

pine jettyBOT
#

é™†æ™Żć’Œ

neon snow
#

You're such a fking chad

#

Thanks mate, I really appreciate it

#

I'll try to fix this, and inevitably fail and probably ask again. But damn bro, thanks a whole bunch

radiant vortex
#

I did it like this. Can someone verify if its okay?

orchid sequoia
#

it seems ok ig

wide spear
marble kestrel
#

if anybody here knows anything about ML or something similar and is willing to help, id really appreciate it. pls ping me if you do.

so i have the following question i need help with. im pretty sure the weights matrix has dimension 2 (its a square 2 by 2 matrix) and the biases are also two dimensional.

im pretty confused on part (b) though

wide spear
#

Chain rule

marble kestrel
# wide spear Chain rule

no, no, like. i dont know how all of these things are supposed to come together. i can do chain rule just fine, i just dont even know what to apply it to

#

to get from the input layer to the middle layer

pine jettyBOT
#

c squared

#

c squared

#

c squared

marble kestrel
#

is this on the right track?

wide spear
#

Sure

marble kestrel
wide spear
#

Yeah

marble kestrel
#

sick. so happy i was doing that right. thanks

wise yoke
#

wait so like

#

optimisation of the svm hard margin thing

#

they say set b = -1 and seem to negate the t and x

#

i thought it would be optimised by setting to b = 1 as the constraint no?

#

confused why they selected the A matrix as each instance of -t^ix^i and not just postive

fresh oxide
#

guys, how would i go about finding eigenvector of a matrix with eigenvalue 1?
in code?

mint hemlock
fresh oxide
#

but power method wont get me as many marks 😂 precisely cuz of the time complexity

#

ill use both tho

#

thnx

#

im basically trying to find steady state vector for a markov chain

mint hemlock
#

Yeah, doing (A-I)x=0 is only funky because A-I is singular

#

hm yeah, do you have an expected time complexity?

fresh oxide
mint hemlock
#

So given a method that has a region of absolute stability D (think about some methods’ stability regions here such as Euler’s), as you allow h->0, in order for your method to converge, does h\lambda_k need to be in this region D. In other words, can there be an eigenvalue with h\lambda_k outside of D without the method diverging (hint: this criterion is a part of what makes a method conditionally stable, think again about Euler’s method)

fresh oxide
#

just the power method will be considered comparitively inelegant to (A-I)x = 0 i guess

#

wait, u are saying power method is faster than finding eigenvector using (A-I)x = 0?

mint hemlock
#

Well it depends on if 1 is a maximal eigenvalue

fresh oxide
#

i have already implmented power method here i guess

mint hemlock
#

doing (A-I)x=0 would require Gaussian Elimination of A-I I would think

mint hemlock
#

mhm

fresh oxide
#

power method for a large matrix is..

mint hemlock
#

Isn’t that just doing Ax for multiple x? so that’s n^2

#

then some number of iterations

fresh oxide
#

complexity of dot product is O(n^3)

#

right?

#

tho i guess if its a vector with matrix it could be O(n^2), not sure

mint hemlock
#

Nonetheless neither method is fast

#

Is there anything known about A?

#

or any property it possesses

fresh oxide
#

yeah its o(n^2)

#

with vector

#

and convergence is linear

mint hemlock
#

indeed

fresh oxide
#

idk what to make of that tbh

#

thats why these numerical methods are kind of tricky

#

how fast would a large matrix converge?

#

eh, i might just do the power method cuz im probably gonna run out of time

mint hemlock
#

yeah rate of convergence is |\lambda_1/\lambda_2|, so you would need to know how large 1 is in comparison to the next largest eigenvalue

fresh oxide
#

its a markov chain

#

so.. i think

#

wait lemme just use eig

#

these are the eignvals in this case

#

this is just a dummy version tho

#

actual matrix is something like this

mint hemlock
#

ah yeah, so power method should converge quite nicely as long as you have similar ratios to that

fresh oxide
#

each row adds up to one