#numerical-analysis
1 messages · Page 25 of 1
i dont know how to convert this to dual problem
how do they reach the end there
im already lost here
since here the x_T so on are complex so first idk what it means by inf, we were only taught how to do dual problems for real functions 😦
do you take Re (<v^t, x_T+x_J+Bz>)?
and also can v be a complex vector
i believe in the end i need to basically formulate it as argmax_v g(v) over a feasible region for v which idk how to find
they're not using a lagrangian there
also, you're gonna need wirtinger calculus to do a minimization including the norm of complex variables, since that isn't analytic
The dual of a given linear program (LP) is another LP that is derived from the original (the primal) LP in the following schematic way:
Each variable in the primal LP becomes a constraint in the dual LP;
Each constraint in the primal LP becomes a variable in the dual LP;
The objective direction is inversed – maximum in the primal becomes minimu...
i wouldn't say this stuff is especially difficult, but you seem to have lots of gaps, like you've never seen this before
why are you working on this

ripticipation
i got some research w/ a prof but yea, they just threw me this paper 
well not prof but like
some phd adjunct prof
sounds like a douche, they could at least explain the basics of the problem setup
they do a bunch of bullshit just to arrive at Ax = b, solve plz
most of the paper is "what's the structure of A?"
its ok im busy the past weeks anyways, they did ask me to ask them if i need help
lol
oh well

Isn't this chat too broad?
Well, also the #point-set-topology
Like some chats are for just one branch and then these 2 chats occupy many things
So?
It would get split up if it were more active
It isn't more active so it isn't split up
Oh, it's because activity
If I have the function $f(x) = \vert x \vert$, how do I find the error term if I use interpolation at $x = -2, -1, 0, 1, 2$.
werwerwercKoshka
Using what basis?
Sounds like polynomial interpolation I guess?
Yeah polynomial interpolation
seems kind of hard since the function is not differentiable, but thats just me not knowing enough
cuz normally u have better explicit error bounds only when stuff are C^1, C^2, ...
actually not even better since f^(n)(x) can be very nasty
largange interpolation sucks
ok but what you can definitely do is just get your interpolating polynomial using lagrange or newton coeff and then compute sup(|x|-P_4(x)) in your case, over whatever interval u want to know the error in
where P_4(x) is the interpolating polynomial of degree 4 since ur using 5 points
You can explicitly calculate the maximum error though
Yea
where to start if I want to write a formula from this?
Aren't there formulae for the a_k and b_k in a Fourier series expansion?
yes there is
I believe I am trying to create a formula and make it similar to a_k and b_k
this is the whole question
I don't know the convention you're using but I think I remember the Fourier coefficients being given by an integral
Prudence
and you can write the complex exponential in terms of sine and cosine
i don't understand what the question is, haven't they already given you all the info you need?
this is also a DFT, not an FT
Oh I've not seen the DFT before
Can I be cheeky and ask if the FT with a counting measure is the DFT?
i know squat about measures so i couldn't say
@prime kraken you said that they already given all the info? So how do I derive a formula from this? do i change it to $f(t) = a_0 / 2 + \sum^{\infty}_{n = 0} (a_n \cos{\frac{2n\pi t}{T}} + b_n \sin{\frac{2n\pi t}{T}})$?
Hawt
what do you mean by derive a formula?
Write a formula that relates the complex Fourier coecients computed by your t package
to the real Fourier coecients, $a_k$ and $b_k$, that define $P_N(x).$
Hawt
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
they have told you to first find the a_k and b_k given a periodic array with entries f_j. then use those a_k and b_k to get your trig poly P_N(x)
what is this "t package"
oh fft package, you had a glitch with the ff
fft package*
well the fft is just a fast implementation of the DFT
so if you look up the DFT, you'll see that they compute the coefficients a_k and b_k at the same time doing the same as in your formulas for a_k and b_k, but instead of using sines and cosines, they use a complex exponential
using euler's formula, these complex exponentials are also sines and cosines as desired
the only thing you might have to be careful with is an extra factor 1/2 depending on if you want negative frequencies or not
you can derive this by noting that you can write cosine in terms of complex exponentials, and same with sine
and just carry those complex exponentials all the way through when you substitute them into the sums
then you note that you can split this sum from 0 to N-1 into a sum with negative and positive indices, symmetric around 0
so i am suppose to take a_k and b_k and combine them to get $c_k = \sum^{N-1}_{j=0} f_j e^{-i2\pi kj/N}$ ?
Hawt
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
yes
anyone understands DFT?
Sort of, It's more like dft is ft on Z/n, and then counting measure comes naturally because it's the "correct" (Haar) measure on Z/n
In general you can do Fourier transform on any locally compact group, and this covers normal ft, Fourier series, dft
The group you start with (R for ft, R/Z for Fourier series, Z/n for dft) ends up determining the Fourier transform domain (R again for ft, Z for Fourier coefficients, Z/n again for dft)
For question 3, how do i take the derivative of the coefficients of $P_8(x)$ to get $P'_8(x)$?
Hawt
Did you use the formula in part 2 to get P_8(x) already?
Do you know how to compute the derivative of sine and cosine
That's all that's going on
thanks. i'll transfer my question there
Quick question in constraint optimization if the inequality constraint multiplier is nonzero then clearly the constraint must be active. But does an active constraint have to have nonzero multiplier? And why
bruh google is your friend
google the proof of complementary slackness
has to do with duality
.. i searched google and found a_n = 0 doesn't mean constraint is inactive, though it is most likely that is the case. I want to make sure though
and seems like that happens when the objective function has an minima that lies on the surface of the constraint function
well yeah, you have something like ab= 0
at least one of them is 0, but also both of them can be
we're also talking about feasible points btw
yea, thanks
I feel like this is a dumb question but I'm going to ask anyways 
heres some context, being asked to experimentally determine if the theoretical complexity of binary search is right
im curious why no matter what base i pick, i always start to get data that runs ahead of the common difference
i dont have a fast enough computer to see if it levels out
i'm not sure i followed exactly what is going on, but if your array size is fairly small, everything probably fits into cache, whether it is L1/L2/L3. naturally if your array size increases, you will face issues of your array not fitting into cache, and runtimes will increase disproportiontely.
but i'm also not an expert at computer architecture, so who knows. there are a lot of weird little details that can affect the runtime of your code
eg if your array size is small, then eg when you fill your array, you will have fetched all cache lines, so they will be in L1 or L2 or whatever. then fetching them again is fast. as opposed to if your array size is big, and they get evicted to say RAM
oh, interesting
I guess i have to write more to just run a bunch while i sleep
maybe i can avoid that effect if i let each time through take a long time
unless you mean i'm seeing the line right there
I guess theres not really a good way to know
how many times are you running it? it says k=6 took 0.4 (seconds?) then if you are not running it too many times (say, 100 times per k) i would imagine it doesn't take that long, right?
I don't know exactly I have to examine it more
its provided by a teacher
I think it's doing its own average of 1000
I did my own average of 5 but just manually since I didn't feel like writing the code to automate it
but I'll tackle that tomorrow
I was curious if there was a strictly like, theoretical reason for that uptick I keep seeing, but maybe thats optimistic or naive
at least as far as i'm seeing it, i don't think this will affect it
What language are you using, and what is the datatype of the key? You can include some accumulators for the data structure to keep track of the number of times certain operations occur (key compare, child lookup), which should perfectly mirror it. If it doesn't, the implementation is off. Sistent nailed a lot of it with the cache considerations, but you very well could also hit magic numbers that even above that cause poor performance/cache thrashing.
Something else is a tad off, the more I started to think about this. What kind of computer are you using here? I have a nice one, but through a VM, running python, with a list of 1 million, bsearch completes in 0.0035 seconds. Are those runtimes in seconds?
Even moreso, the more I hacked around here, the only real stress and time is allocating the list so that I can actually time the binary search. With 10GB available to the VM, I can only do 10^8, the setup to fill the list takes a long time, but the actual binary search is still way, way less than your computer is taking. It really shouldn't take .4 seconds
Yeah, it's taking about 0.05 seconds for me to run a binary search on 10^8 keys, taking about 8GB of ram (not efficiently). I have a fast computer, but not like, 10x faster than a typical computer. Something is up with your setup, code, or the way it's being timed.
Not an expert, but theoretical complexity and real implementation can somewhat seem different at times
As you have people talking about L1/2/3 caches here and so on - those might not be part of your original complexity model
won't you anyways run into issues with the cpu scheduler?
Not in an appreciable amount, that should be a scalar on the existing time. A bsearch is going to be in a single thread, whatever else is happening on the system, is happening on the system in equal amounts. If you can only get xx% of cpu time, that's true for any amount of real work. There's a lot of context switches under the hood, at the point of .1 seconds, you're already hitting tens of thousands
That is to say, unless your system is already pegged at 100% utilization, if you hit y context switches in 0.1 seconds, you should expect to hit 2y context switches in 0.2 seconds, for roughly the same effect, which is why you repeat the test several times to cut down on noise
surely, but it'll never reach the exact value, especially if done in big O notation
though one would expect it to become more and more accurate with larger inputs, which is not what we see there
hmm
I think it doesn't make sense to try and isolate some of the black magic that is the modern CPU, until we're sure the algorithm is properly implemented. 0.06 is a long time, like, a long time for a simple search
I think moreso than the cache, depending on the amount of RAM OP has, if it's getting large enough to hit the pagefile, that's going to really be hitting it hard
I also don't want to rule out that the keys are some variable length based on the input size. Like, string comparisons aren't constant time, if the size of the keys are increasing, and the time required to compare keys increases with it, that could also be a hit
that's some good insight
oh i missed all these replies
sorry yall
The code might just be poorly written, but it is doing an average of 100 trials
oops sorry i meant to turn off the ping there
ive just been grabbing the total time, so those numbers are of 100 trials
my computer is pretty slow too
I don't know enough to analyze exactly how much ram I have available for this code but I have very little ram in this computer so i wouldnt be surprised it were hitting some limit, if that were the case i'd think it'd suddenly spike instead of slowly accumulate, but i dont really understand computers
also honestly i may just be dumb and not read the code close enough because i think its made to be slow 👀
// search runs too fast, so this first code is just to slow things down...
for (int i = 0; i < 100000; i++) {
int j = 2 * i;
}
...```
although i guess that just slows it down, its not gonna accumulate there
"search runs too fast"? I'm not sure what this is supposed to mean
idk removing that line makes it hard to not get a 0
idk ill just start playing with something other than this given code which im not sure is helping
thanks for the input yall
Nah, that's a fine thing to do, to simulate work. There's still a lot of gotchas, I'm guessing you're running this in a debug config, vs release with like -02 optimizations
At a certain point of consuming RAM, the OS will recognize there is memory pressure. Either all physical RAM is committed, or it's being proactive to prevent all RAM from being committed. What it does, is take a "page" of RAM, and saves it to disk, and removes it from RAM. A page is like, 4kb, but OS and setting specific. If you have 4GB free, and you use 4.5GB, well, .5GB goes to disk. If you commit that RAM, but then never use it, reference it, look at it, let it play any reindeer games, it won't slow you down at all
But, if you have a random access pattern, it will proportionally¹ slow down the system, as each time that page gets a memory hit, it takes a while to pull it off disk and back into RAM
So like, there's never really a wall you hit, where using a 4GB allocation takes 10 seconds, but using a 4.5GB takes 20, just that the first 4GB might take 2.5 seconds for each GB, then each GB after that adds 3.5 seconds. Totally made up numbers
1: The CPU and mobo seriously do some black magic. If you hit address 0x80001000, then 0x80002000, then 0x80003000, it doesn't matter if 0x80004000 is in RAM or not, if your access pattern is predictable, the CPU will request 0x80004000 be loaded back into RAM and even into cache before you even think about using that address, because it saw it coming. There's a lot that goes into it, like, if you're just doing some basic accumulator, you'll still be limited by the bandwidth of loading from disk, but if your bottleneck is CPU bound, like, having that loop you have there, there's a chance the swapfile will literally never impact performance. It all depends, though
Can anyone help me check my proof? @knotty siren
Let $\alpha, \beta \in K$ and $p_1, p_2 \in R = K[x]$. \
Then $\Phi(\alpha p_1 + \beta p_2) = \lambda$ \
$= (\alpha p_1 + \beta p_2) (\lambda)$\
$= (\alpha p_1 (\lambda) + \beta p_2 (\lambda))$\
$= \alpha \Phi (p_1)(\lambda) + \beta \Phi (p_2)(\lambda)$\
This shows that $\Phi$ is a linear transformation.\
Now, $\ker \Phi = {\Phi (p)(\lambda) = 0: \forall p(\lambda) \in R} = {p(\lambda) = 0: \forall p(\lambda)\in R}$\
If $p(\lambda)$ is a nonzero polynomial and satisfy $p(\lambda) = 0$ for all $\lambda \in K$, then $p(\lambda) \equiv 0$. Therefore $\ker \Phi = {0}$\
Since $\ker \Phi = {0}$, the function $\Phi$is injective.
Hawt
This is not the right channel for this
Additionally, please wait 15 minutes before pinging helpers
read #❓how-to-get-help , read what the post above says as well ^
@rustic ocean thank you i didn't know the questions part will also help out with advanced math.
np
what is the Moreau Decomposition and why is it useful? It's just stated more or less as a fact in my lecture in the chapter about the proximal gradient method without much information
this is weird but
what's the difference between the jacobi and kaczmarz methods?
they seem to have different convergence conditions, but it seems possible to write both of them in the same way
right?
you can write both of them as an update of the form x_k+1 = x_k - D (b - Ax_k) with some diagonal mat that has elements 1 / (some norm squared)
hmmmmm
kaczmarz puts the row norms in that D matrix
jacobi puts the column norms, but its for symmetric matrices, so it's the same as the col norms
jacobi special case of kaczmarz?
weirdly enough it seems kaczmarz converges under kinda weak conditions
is kaczmarz u refer to iterative projection onto the rows of A (hyperplanes)?
yeah
yea I don't know much, but I just remember the kaczmarz convergence is in expectation if you are randomly choosing the rows of A
to project onto
i was thinking of the deterministic one, but yeah, i have seen that
ah
Hey guys I have a quick question: I want to start my masters in applied mathematics and to prepare during my bachelors (in my last year currently studing math in general) I would like to know which physics curses you would recommend me to take? So basically do applied mathematicians use classical mechanics, electromagnetics, general relativity and or quantum mechanics or even field theory? I am guessing you can specialize in each one of them but to have good basis as a mathematician in an applied math master which would you recommend most? Thanks in advance :)
I've not taken a single physics class and I'm start my phd so ~~
If you ask dami, he will say, "physics is for nerds"
How can I decouple a time delayed 2x2 transfer matrix that has a singular static gain matrix?
Can someone explain to me how to solve stiff differntial equation with splitting?
Can I just split off the stiff part? Does it help when I do that?
Yes operator splitting is one way of handling stiff differential equations
The idea is to compute solutions for the two parts separately and then combine them in a good way
You might be interested in reading about Strang splitting
Strang originally devised this splitting technique for systems of ODEs that have varying time scales like chemical reactions, which are a classic source of stiff equations
Maybe this is a good brief intro
With lots of sources for further reading
In the literature you may also see this referred to as an implicit-explicit method
You use a cheap/easy to compute explicit method like RK4 for the non-stiff part
And you use a more expensive/more difficult to compute implicit method like GL2 for the stiff part
Thank you very much :D
So this means if I have an equation like this $\dot y = f + g$ with a normal part $f$ and a stiff part $g$, then I can split it into $f$ and $g$ and combine them with strang splitting?
Rassilon
Yes
Thank you :D
There's a bit more specifics about how you actually perform the splitting
But you can read the details on your own
hullo
Do you have a question
Does anyone know cpp? and doesn't mind reviewing some of my code? it's mathematical programming at its essence. thanks
People don't really want to commit to reviewing code a priori, you can just post it and someone might comment
Coding stuff goes in #computing-software generally
This channel is more conceptually stuff as opposed to implementation stuff
I just noticed an elementary problem that I cannot find a way to resolve rn.
So, given a C^2 function, is there a way to reasonably approximate the second derivative through finite differences by successive application of first derivative through the same finite difference scheme?
1 ) it seems the error becomes too big to handle once you use central differences twice, because you get the second derivative central differences but with half the mesh size
- another way of getting the central scheme for the second derivative is to apply forward and then backwards difference, but if Im trying to code specific differential operators and trying to reapply them when needed. should I have to pick and choose those different ones when needed?
I also dont think there is any central difference scheme that can be reapplied and keep the mesh size established, so what is the standard solution? should I use 2)?
tag me if you have an idea
1 is fine if you're using a higher order scheme
you mean applying 4th order twice, for instance?
Sure
Or
Get two first order derivatives using a higher order scheme
Then use a two point centered difference for the second derivative
idk if this goes in here or #advanced-number-theory but
If I have a sublattice of Z^d for some large d, and I know this sublattice has dimension n (with n < d)
and I have n vectors in this sublattice that I know are linearly independent
then is there a way to get a basis of this sublattice from that?
Zoph I don't think you can
Let's say you have Z^d and you have n vectors of the form (pm1,....)
Then you can't tell what this sublattice was generated by
@median plinth
beware that applying first forward then backwards is not equal to standard Dirichlet Laplace central difference, it imposes Dirichlet and Neumann BC. If you want to multiply discrete operators you have to manually correct corners of the matrices/impose correct BC. To avoid this either work with ghost points, or use the spectral symbol and generate the appropriate matrices from it (and then impose BC) for example f_forward(t)=-1+exp(-it) and f_backwards(t)=-1+exp(it) and f_forward*f_backwards=f_laplace=-2+2cos(t) bilaplace is just f_laplace(t)^2 which is (-2+2cos(t))^2=-6+8cos(t)-2cos(2t) this works for all discretizations of PDE (higher order FDM,FEM, FVM, DGM, IGA etc) and also FDE
(also, if you use the spectral symbol to define your operators before actually creating them, then everything automatically translates to multi-d)
Oh yes duh, cause different lattices can contain those vectors that was a dumb question
I guess what I meant to ask is that

If I have a subspace of C^d, how do I find generators for the lattice of all integer vectors in this subspace
If I have n linearly independent integer vectors in this subspace
Yeah the subspace and so the lattice are both n dimensional
Integer vector as in all of the entries are integer
So the lattice is every integer vectors
Or is it vectors of all integers
Integers being Z[i] I assume?
No integers like Z
But your lattice is in C^d?
I'm not sure what the issue is?
Don't you run into the same issue of different lattices having overlapping points
I don't think so? The lattice is specified here. It's the integer sublattice inside of the C-span of the n vectors
But your n vectors don't generate the desired lattice?
No they might not
I don't see why orthogonality really matters here? If that's what you mean by GS-like
Or it's like
Start with a 1-d sublattice
Figure out a generator
Iterate on the dimension
Yeah I feel like this iteration is hard
Ok let's say you have a sub-sub-lattice with k generators
You want to find the k+1th generator
So you add the k+1th dimension from the original lattice into it
And then you have the k+1th independent vector that you start with
And you can compute what lattice you get if you treat this k+1th independent vector as a generator
And you can compare it with the actual one
And then see which points are missed
And then perform the appropriate scaling
Does this make sense to do
Maybe it doesn't make any sense to do
Yeah I don't get how you can compare it to the actual one to see which points are missed
Aren't you given the actual latice
Oh I see
You can only check if vectors are in the lattice
But you can't check if the lattice contains vectors that you don't have?
Yeah I don't see how you could check such a thing
Ok
Ummm
You know that every vector is an integer vector
So you start with the independent vector from the list
Compute the gcd of the elements
Divide by that
Check if it is in the lattice
If not mult by 2
Check
Etc..
Dividing by the gcd will always be in the lattice
So there's no need to check
But this still doesn't work, the example I've been thinking about is
(1,-1) and (1,1)
The integer lattice in the span of these two vectors is the usual Z^2
I mean that
I look at the C (or equivalently R)-span of these two vectors
And look for all the integer vectors inside of that span
Isn't your lattice all of R^n then
What? The lattice is the integer vectors inside of the span
Right
But if you have n lin indep vectors
And then you consider the C/R span of them
You get C^n/R^n
And then your lattice is all of the integer vectors
I do not understand
Mm how should I say this
Consider the same example of (1,-1) and (1,1)
Abstractly the R span of these is isomorphic to R^2
But the integer vectors are not spanned by (1,-1) and (1,1) if that makes sense
In this case, it is all of the integer vectors
But if I were to consider like
(1,-1,0) and (1,1,0) then I want (a basis for) all integer vectors with 0 last coordinate
Ok so the standard basis for whatever subspace
Idk what you mean by standard here
Just (1,0,0), (0,1,0)
I mean, if the subspace is spanned by like (1,0,0) and (0,1,1) then you don't have a standard basis
Wait ok so why is dividing each vector through by the gcd of its elements not good
The previous example of (1,-1) and (1,1) shows that this doesn't work right?
What about row reduction
hm yeah that might work
At the very least it works for (1,-1),(1,1) and (1,0,0),(0,1,1)
yeah I feel like this does work but have no clue how to prove it
Induction
that makes sense
thank you
ok yea I'm pretty sure that this works now, thanks for the help
Lol I'm sorry I took so long to understand what you were asking
I have have a problem with the exponential Euler Method
I should solve this differential equation
Rassilon
Using this method:
Since the differential equation is not autonomous, I am autonomising it, this is what I would get for the right hand side.
When I calculate the jacobian of it, I get this
Rassilon
Which contains the inverse of the jacobian
But because the last row and collum of the jacobian is zero, it is invertible
Does anyone know how to do this?
Sorry, I meant $\varphi$
Rassilon
Does anyone here know any good resources for learning the basics of computational maths with python, regarding DEs, PDEs, dynamical systems etc.?
Why do you have t in there
A good course (but julia instead of python) https://computationalthinking.mit.edu/Spring21/
thanks, may have to translate julia code to python
Because the cos(wt) depends on time
And the method is for time independent differential equations
So I would to somehow make it time interpenedant, right?
not sure if this is better for in here or the algebra channel, but just finished a first course... I keep hearing that this all is used in some random engineering/ biology and was interested in that. Then, came across this in the book, which allows for a bit more of a concrete q
where can i read more ab this stuff? simple google searches have not yielded what i have in mind
lol i mean just found this from a google search for example http://math.uchicago.edu/~may/REU2012/REUPapers/LiuH.pdf
Perhaps too technical, but iirc there's a link to Hamming code for example
ah
maybe i was too general
i was just searching like
"computational algebra" lmao
Signal processing is stuff about Fourier transforms and stuff
yea
ECC is as mentioned by nuclearpotat
Crypto is stuff about elliptic curves
ECC can mean both error correcting codes and elliptic curve cryptography, interestingly enough
You might also be interested in An Invitation to Nonlinear Algebra by Bernd Sturmfels
thanks
that looks to be about what i had in mind
the sort of math that they wont teach a course on bc smaller school and too niche but that they fill the drp projects with
is this the right channel to ask on about operations research, linear programming in particular?
Sure
I'm trying to find the zero of a function in an interval [a,b] where a is extremely small (1e-6). So far I've been using SciPy's bisect and it's been giving satisfactory results. I wanted to speed up my computations so I tried using brentq or brenth instead, unfortunately these are failing in many cases. In particular, they return 0 as a solution quite frequently, which is out of bounds. According to this SO answer, this is something that is known to happen. https://stackoverflow.com/a/32815270/5433929
Anything you guys suggest I use that would solve my issue, namely something that:
1/ Converges faster than bisect
2/ Works for finding a solution in a bounded interval
3/ Is guaranteed to return a solution that is actually in the interval
I should add that my function is not quite well behaved at the edges of the interval
Use a high precision root finder. For example Roots.jl + BigFloat in Julia
I have had issues with the Newton method testing values that are out of bounds where the function is not defined
The simulation software I’m working on is in Python.
I assume there is the equivalent thing in Python using mpmath?
It looks like mpmath has a root finding algorithm but it doesn’t allow for a search within prescribed bounds
seems to be some solvers with intervals here https://mpmath.org/doc/current/calculus/optimization.html
what kind of problem are you solving? 🙂
Looking for the roots of these kind of functions
where phi is the standard normal PDF, and Phi is the standard normal CDF
was it faster than bisect? seems like mpmath in python is pretty slow, never used it myself.
I've been trying to use mpmath just now and fo some reason it's not working. I have a function of one argument that does something, I did:
from mpmath import findroot
def func(x):
return something
findroot(func, (a,b), solver='illinois')
and it fails with this error:
File "C:\Users\User\AppData\Local\Programs\Python\Python39\lib\site-packages\mpmath\calculus\optimization.py", line 937, in findroot
fx = f(*x0)
TypeError: func() takes 1 positional argument but 2 were given
I have not idea what the reason for this is, I tried findroot before with a simple function like x -> x**3 and it worked perfectly well with this syntax
I can’t at the moment but will do in like 6 hours when I get home 👍
@warm otter Here's a minimum viable example:
from scipy.stats import norm
import numpy as np
from mpmath import findroot
#Some numbers defined outside of the scope of these functions
R = 0.5
K = 1000
sigma = 0.5
tau = 1
gamma = 0.995
m = 800
EPSILON = 1e-6
def quantilePrime(x):
return norm.pdf(norm.ppf(x))**-1
def getMarginalPrice(amount_in):
return gamma*K*norm.pdf(norm.ppf(1 - R - gamma*amount_in) - sigma*np.sqrt(tau))*quantilePrime(1 - R - gamma*amount_in)
def func(amount_in):
return getMarginalPrice(amount_in) - m
optimum = findroot(func, (EPSILON, 1 - R + EPSILON), solver='illinois')
this will return the error I'm getting
Traceback (most recent call last):
File "C:\Python39\lib\site-packages\mpmath\calculus\optimization.py", line 937, in findroot
fx = f(*x0)
TypeError: func() takes 1 positional argument but 2 were given
I guess this is the problem: TypeError: ufunc 'ndtri' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I'm not sure I understand what this issue is 🤔
as far as I understand scipy.special.ndtri does not supprt mpmath data type? I got that error when I ran your code.
No idea about python, sorry.
np
Oh that’s actually brilliant, I don’t need the accuracy, just using this to have a robust solver implementation 👍
just testing this and it works, thanks a lot!
Hi guys, I've been having a lot of trouble with this, but basically what I'm trying to do is to optimise the prolongation part/restriction in a multi-grid scheme. To do that I use a stencil based algorithm: first I construct my differentiation matrix corresponding to my PDE (here it's Darcy's equation, with a field drawn from a log-normal distribution, Darcy's equation: $-\nabla \cdot (g\nabla u) = f$) then I use some kind of heuristic to compute the prolongation matrix. After doing so I use a heuristic to compute the prolongation matrix, when I get it I have to then construct the prolongation matrix from the stencil I just computed. The trouble I have is in constructing this matrix. I want to do that for a Dirichlet boundary condition.
Anarchy
What kind of discretization do you have, how many dimensions, what kind of multigrid? For an equispaced grid typically R=0.5P'
2D, geometric multigrid, equally spaced points like in the following screenshot:
I have only worked on algebraic multigrid (and not very much), sorry
alright, thanks anyways 🙂
Here's how to do in 2d with periodic boundary conditions
This notation confuses me. I use something called the spectral symbol and work with that (using Kronecker products etc), and then the materialization/generation of the actual matrices becomes trivial. It is kind of an advanced (and more accurate) local Fourier analysis
you literally uploaded a paper about it https://www.researchgate.net/publication/344638974_Momentary_Symbol_Spectral_Analysis_of_Structured_Matrices
🙂 here is another https://epubs.siam.org/doi/10.1137/18M1210460
thanks I'll read more about it !
I'm working on this btw: https://arxiv.org/pdf/1902.10248.pdf
I have one paper ongoing on parallel-in-time MGRIT discretization, where previous results are wrong (LFA can not properly handle non-symmetric matrices) But really, I know very little about multigrid, it is mostly collaborators that work on it
I see what you mean, as a consequence of that I use block fourier analysis
I'm not an expert on multigrid either I just happen to get this internship where I had to study this paper
thanks ! I have to read papers about it anyways, otherwise I have no way of explaining why we use it
https://arxiv.org/pdf/2003.05744.pdf
Here's an other paper about optimizing amg thanks to graph neural networks
If you have any specific questions, just send me an email and I can forward to somebody that can probably answer
Alright thanks ! That's really nice of you 🙂
I need help with vandermonde decomposition of toeplitz matrix. Below is ~2 paragraphs for context, though my question is pretty short
for context X here is fixed matrix which can be written as a sum as shown above. the point of confusion is why X is in the range of V
from wikipedia I saw that for any u in C^n, if toep(u) is rank k < n, then toep(u) = V* D V is the unique vandermonde decomposition where V in C^{nxk} is a vandermonde matrix
i.e. V can be written as [a(f_1),..., a(f_k)] in C^{nxk} for some f_1,...,f_k in [0,1), where a(f_i) = 1/n*[1, e^2pif_i, ..., e^2pi(n-1)f_i] for all 1 ≤ i ≤ k
unless Im not looking at the correct vandermonde decomposition, I don't see how X has to be in the range of V since the rank of V is at most k, and the rank of X could be r, where r > k can happen
maybe fits more in #linear-algebra but whatever
lol this is scuffed
d_i ≥ 0 but then they go on to invert D
ok i got this sorted, realized they implicitly meant the generalized inverse of D, and the proof overall works
Do you have some modern reference to this "Vandermonde decomposition" of Toeplitz matrices? When googling it seems to refer to a paper from 1911 something in the 70 and then Z Yang a few engineering papers? (I pretty much only do research on Toeplitz-like matrices and I have never heard of it, so I am intrigued, but suppose it is just a different name for some other standard definition?)
This paper is vague about vandermonde decomposition and I noticed also it’s not easy to find clear definitions
I referred to the one in the Wikipedia article here https://en.m.wikipedia.org/wiki/Toeplitz_matrix
In linear algebra, a Toeplitz matrix or diagonal-constant matrix, named after Otto Toeplitz, is a matrix in which each descending diagonal from left to right is constant. For instance, the following matrix is a Toeplitz matrix:
[
...
Which result is basically taken from yangs paper, but sorry I’m not expert and just skimmed that paper
seems to be a standard circulant decomposition at first glance
but not just for circulants
I have to read up on this, thnx 🙂
Oh i know one of the authors of the paper referenced in the wikipedia article (Stoica)
Anticipation
A longer toeplitz question, but self contained: Let $\begin{bmatrix} \text{Toep}(u) & X \\ X & W \succcurlyeq 0$ where $u \in \mathbb{C}^n$, $X \in \mathbb{C}^{n \times n} and $\text{Toep}(u)$ is assigning $u$ to first column and row to define a square toeplitz matrix. First fix a $u$ and so by vandermonde decomp theorem there is $V \in \mathbb{C}^{N \times L}$ where $L \leq N$ so that $\text{Toep}(u) = VDV^{\star}$ where $D$ is diagonal with positive entries. It seems by schur complement there is a $B \in \mathbb{C}^{N \times L}$ where rank(B) = rank(X) so that $X = VB$. The question is then if its true that there is a $E \in \mathbb{C}^{L \times L}$ so that $W = B^\star E B$. And if so, how to show it.
```Compilation error:```! Missing $ inserted.
<inserted text>
$
l.55 ...vandermonde decomp theorem there is $V \in
\mathbb{C}^{N \times L}$ ...
I've inserted a begin-math/end-math symbol since I think
you left one out. Proceed, with fingers crossed.```

A longer toeplitz question, but self contained: Let $\begin{bmatrix} \text{Toep}(u) & X \\ X & W \succcurlyeq 0$ where $u \in \mathbb{C}^n$, $X \in \mathbb{C}^{n \times n}$ and $\text{Toep}(u)$ is assigning $u$ to first column and row to define a square toeplitz matrix. First fix a $u$ and so by vandermonde decomp theorem there is $V \in \mathbb{C}^{N \times L}$ where $L \leq N$ so that $\text{Toep}(u) = VDV^{\star}$ where $D$ is diagonal with positive entries. It seems by schur complement there is a $B \in \mathbb{C}^{N \times L}$ where rank(B) = rank(X) so that $X = VB$. The question is then if its true that there is a $E \in \mathbb{C}^{L \times L}$ so that $W = B^\star E B$. And if so, how to show it.```
Mental World's Essence
A longer toeplitz question, but self contained: Let $\begin{bmatrix} \text{Toep}(u) & X \\ X & W \succcurlyeq 0$ where $u \in \mathbb{C}^n$, $X \in \mathbb{C}^{n \times n}$ and $\text{Toep}(u)$ is assigning $u$ to first column and row to define a square toeplitz matrix. First fix a $u$ and so by vandermonde decomp theorem there is $V \in \mathbb{C}^{N \times L}$ where $L \leq N$ so that $\text{Toep}(u) = VDV^{\star}$ where $D$ is diagonal with positive entries. It seems by schur complement there is a $B \in \mathbb{C}^{N \times L}$ where rank(B) = rank(X) so that $X = VB$. The question is then if its true that there is a $E \in \mathbb{C}^{L \times L}$ so that $W = B^\star E B$. And if so, how to show it.
```Compilation error:```! Missing } inserted.
<inserted text>
}
l.102 \end{document}
I've inserted something that you may have forgotten.
(See the <inserted text> above.)
With luck, this will get me unwedged. But if you
really didn't forget anything, try typing `2' now; then
my insertion and my current dilemma will both disappear.```
?????
.....
Please make it work
yea give me a moment lol, ill test locally and paste
A longer toeplitz question, but self contained: Let $\begin{bmatrix} \text{Toep}(u) & X \ X^\star & W\end{bmatrix} \succcurlyeq 0$ where $u \in \mathbb{C}^N$, $X \in \mathbb{C}^{N \times M} , W \in \mathbb{C}^{M \times M}$ and $\text{Toep}(u)$ is assigning $u$ to first column and row to define a square toeplitz matrix. First fix a $u$ and so by vandermonde decomp theorem there is $V \in \mathbb{C}^{N \times L}$ where $L \leq N$ so that $\text{Toep}(u) = VDV^{\star}$ where $D$ is diagonal with positive entries. It seems by schur complement there is a $B \in \mathbb{C}^{N \times L}$ where rank(B) = rank(X) so that $X = VB$. The question is then if its true that there is a $E \in \mathbb{C}^{L \times L}$ so that $W = B^\star E B$. And if so, how to show it.
Anticipation
So what I've got is at least if rank(B) = M, (i.e. B has full column rank), then yes for any W there is an E so that W = B*EB. The problem is I want to lift the restriction rank(B) = M. Somehow, I'm wondering if given the matrix X, defining u will restrict the rank of W somehow in order for the matrix in 2nd line to be PSD
and perhaps the restriction to W is always so that there's an E s.t. W = B*EB
oh E has to be PSD too
(V is a unit length vandermonde matrix, each column 1/N*[1, e^{2pif},..., e^{2pi(N-1)f}] for some f in [0,1))
sorry if this in the wrong channel but idk where else it could go, can anyone help me out with this problem?
the thing about this problem that i don't understand is that with other examples they're always defined as DP problems with a defined number of stages, whether it's a finite number of stages or an infinite number. here i have no idea whether there is only one stage to the problem or if there's an infinite amount
Hey can someone help with Matlab (plotting simple 1st order ODE using ode45)
Hey if you wanna do that, I'd advise you to see the following video https://youtu.be/EnsB1wP3LFM
This video shows how simple it is to simulate dynamical systems, such as the Lorenz system, in Matlab, using ode45.
These lectures follow Chapter 7 from:
"Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz
Amazon: https://www.amazon.com/Data-Driven-Science-Engineering-Learning-Dynamic...
It's on the Lorenz system and he uses ose45
Ask about matlab in #computing-software
Here, I think they just want you to use your knowledge of convex functions to solve this. Part a) should be easy as you just need to identify the constraints and objective function. Part b) you can first prove that F(u1,u2,...,un) = f(u1)+f(u2)+...+f(u_n) is convex if f is convex, then it will help to consider first R^2 case and what happens when you maximize F convex on a line (which generalizes to R^n)
part c) is just the opposite of part b), i think. could be wrong
so for example to get started, in part a) the objective function is $\sum_k f_k(u_k)$ and the constraints are $u_k \geq 0$ and $\sum_k u_k = b$, which i think is all you need for optimality equation, unless ur definition is different
Anticipation
ty for replying, i thought i was gonna be stuck on this one forever haha
what you said is probably right but my lecturer does all of the tutorial questions by stating the "states" and "actions" and then making an optimality equation out of those
so here's an example that he did
i think only the first part of the solution is relevant to this which is just defining the optimality equation
so basically i'm meant to define something in the question as a state and another thing as an action
i figure now that the DP problem has infinite stages and each u_i i'm "investing" is a stage
ah ok. tbh my understanding on dynamic programming is very basic, i think edd would know alot more to answer u, though doesnt look like hes active these days...
i don't blame you for not studying it, i realise now why nobody on my course chose it lmao
after i finish this class i'm staying away from it cause i don't understand any of it haha
I've been exposing myself to new numerical methods lately. Things like multigrid, or LOBPCG are new to me. I've been looking into good matrix-free methods for solving eigenvalue and generalized eigenvalue problems (particularly for sparse systems), and I was wondering, is multigrid PC + LOBPCG state of the art? If not, what is?
For what type of matrix?
you mean other than sparse?
Yes, is if for a PDE discretization? What kind of geometry is it in that case etc.
gotcha. It's for PDE discretization, Schrodinger-type equations. Geometry will be some kind of pre-defined volumetric shape, depending on user input (e.g. cube, rectangular prism, or other; there's like 15 cases)
I just published https://link.springer.com/article/10.1007%2Fs11075-021-01130-9 and same idea works for complex spectra. This works for Toeplitz-like matrix sequences in a very wise sense (so almost all PDE discretizations work, as long as you can define a sequence of matrices + the resulting symbol is monotone). Works for generalized eigenvalue problems. And also works for eigenvectors.
You will probably have this type of matrix sequence then, this method will be orders of magnitude faster than anything else available as far as I am aware
especially for non-Hermitian sequences
I have examples where I get results in seconds where normal methods will take years (estimating)
If you have a code for say a hypercube finite differences send me a code that generates the matrices and I can give you the symbol
Checking it out now, I'd like to wrap my head around the idea real quick
You can check the intro of my thesis, pretty short and to the point www.2pi.se/thesis.pdf
the general idea is: you can approximate a function g(t)=c0(t)+c1(t)h+c2(t)h^2+... and then for any size matrix An you just construct a grid t_{j,.n} and the eigenvalues of An are approximately g(t_{j,n})
so to compute eigenvalues you just evaluate a polynomial
Alright, checking out the thesis. Meanwhile, the idea seems reminiscent of the Kernel Polynomial Method (KPM). Any comments in that regard?
never heard of it, will google 🙂
thnx
the idea behind what i do comes from operator theory (one of the first papers was by Böttcher)
do you know if they have any code?
reading through the intro rn of the thesis. for KPM, there are a couple implementations out there specific to the condensed matter physics literature, so for ex. the implementation I know of is in: https://docs.pybinding.site/en/stable/tutorial/kpm.html
Approximating various functions using the kernel polynomial method (KPM)
KPM is a method for approximating the spectral density function of matrices
It has similar speedup to what you said, solving for full spectra (approximately) of like 2Mx2M matrices in seconds
sounds related 🙂
From what I remember, KPM involves using a Chebyshev polynomial expansion of the matrix or something, and a stochastic sampling of matrix-vector products to approximate the spectral density, they get quickly convergent results
Your work is saying not even matrix-vector products are needed, interestingly
the only thing needed is input: n0 and alpha, typically alpha is 3 and n0 is say 100, then you need matrices of sizes 100, 200, 400 (and in one version of method also 800) compute eigenvalues of those and that is all you need to get eigenvalues to machine precision for any size, also works in any number of dimensions
done on FDM,IGA, DGM, FEM, would work fine for FVM too
If you have some matrices I can do it on them and you could compare with this KPM 🙂
btw if you work on non-Hermitian problems, then never trust LAPACK, it is more wrong than right
Oh my god the shade
suppose one didn't have the matrix stored explicitly, but can construct the matrix elements; or in your case, the diagonals for the Toeplitz matrix
i just need a black box that returns numbers in a "correct order" it can be eigenvalues, singular values or even elements of eigenvectors
so say I want to compute for bilaplace, then i wite a function bilaplace(n) that returns the sorted eigenvalues of the nxn bilaplace matrix. and that plus these two paramaters alpha and n0 is all that is needed
maybe I'm misunderstanding. In order to compute the eigenvalues, you need the eigenvalues as input?
presumably the eigenvalues as input are on a coarse mesh or 'small' version of the problem?
yes. I use a standard )most often high precision) eigenvalue solver to compute the eigenvalues of say 3 small matrices. then i can extrapolate this information to any sized matrix
the error will be O(Ch^(alpha+1)) where h=1/n for the matrix you are interested in
I see. This works because of the special structure of Toeplitz matrices.
no
(no?)
if it is a Toeplitz-like then one can save computations. But ok, yes in a sense. One have to be able to define a "sequence of matrices" and that is trivial for Toeplitz matrices
invert a Toeplitz, and it is not Toeplitz
works for that too
so, it ill work if you can define any sized matrix in the sequence you are interested in
Okay, I have two contexts in mind. One is a "regular PDE" system that will solve equations of the form
$$ \laplacian \psi(\vec x_1, \vec x_2, ...) + V(\vec x_1, \vec x_2, ...) \psi(\vec x_1, \vec x_2, ...) = E\psi$$, where the potential $V$ is pairwise between the x_1, x_2, etc
MellowFermion
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
This is e.g. quantum chemistry PDE
The sequence of matrices could naturally be the grid discretization
yes
The second context is in many-body condensed matter physics, where the only natural "sequence of matrices" we can make is "adding particles" via tensor products with say 2x2 matrices (in the simplest case; NxN more generally). They may be sparse but these matrices are I think fairly far from Toeplitz
(I may be better off sending matrices, huh?)
that would be nice 🙂 I will have a simple example here though: say you diszretice the negative laplace ian thenyou have the matrix 1/h^2 Toeplitz(-1,2,-1) where 2 is on the main diagonal. then you need to rescale the problem by multiplying by h^2 otherwise the spectrum blows up as h-> 0 so one looks at just Toeplitz(-1,2,-1) and this trivial example we have the symbol f(t)=2-2cos(t) and the exat eigenvalues are given by t_j=(pi j)/(n+1) if we now want to look at 2d laplace which is not Toeplitz the you just add a symbol in each timenstion that is g(t1,t2)=f(t1)+f(t2)=2-2cos(t1)+2-2cos(t2) which can be directly translated to the correct 2D laplace matrix. the exact eigenvalues can be retrieved exactly by sampling this 2d surface. I would think your second example would behave like that
to me the physics connection just confuses me 🙂
your example for 2D Laplace makes sense for separable PDEs like Laplace.
What about non-separable PDEs?
I would think it would work, I have not looked at an actual PDE but just "model problem symbols"
so i would very much appreciate if you have the time to give an example 🙂
(i can not promise to look the coming weeks I have teaching that starts i a week)
i'll work on making an example matrix. does the MatrixMarket format work?
sure
but would be very nice if i can generate "any" size
because i need to actually look at a sequence
but if you make a sequence of say 10 different sizes it is ok
the second context I mentioned would necessarily have sizes 2^N
that is ok 🙂
I have worked with space time discretizations where there are infinitly many sequence that can arrive the the matrix of interezt
Is LAPACK lacking for non-hermitian problems
Yes a couple of examples in https://www.2pi.se/publications/pdf/p13.pdf That is why people typically use pseudospectra instead of eigenvalues for analysis. The only "normal" remedy is to use higher precision solvers. For example the main matrix they look at in https://doi.org/10.1007/s00020-020-02619-z you need somewhere between 2048 and 4096 bit precision to get 16 digits of accuracy for a matrix of size 1631 (and that takes like 2 weeks on a threadripper 3970X). It is even worse than the Grcar matrix which is a typical model matrix to break solvers (you can find it in MatrixDepot)
double precision is 53 bits
I wonder if Demmel has heard about this
He + co are working on a new release of LAPACK
But it seems like the focus is mostly on randomized NLA
Yeah, it is well known.

but i know quite a few papers when authors do not know this 😛
Well
It might be too late for this upcoming LAPACK release
But if you send Demmel an email
I'm sure he'd want to hear more
there are no double precision algorithms that work as far as i know
I use generic algoritms in julia
Most of his research in recent years has been more focused on communication avoiding algs
I hope to have a talk in Stockholm where https://www.su.se/profiles/zhba2227-1.432196 listens 😄
he is also one of the lapack people
Ah ok
So this is what I use https://github.com/RalphAS/GenericSchur.jl
@fathom rain I found a non-physics review of the "spectral density estimation" methods like KPM I mentioned earlier:
Lin, Lin, Yousef Saad, and Chao Yang. "Approximating spectral densities of large matrices." SIAM review 58.1 (2016): 34-65.
In particular, I found something that you may be interested in:
If I understand it correctly, for your work, instead of (3.14), you get f(t) from the eigenvalues of smaller versions of the matrix involved
Might be, I have to look into this, thanks for bringing it up. It is definitely some overlap, I just have to understand the approach and notation. So, my work is an extension of http://library.lol/main/34B1568B1BAF015A21F8B15577CBD5AE , going from O(h) to O(h^(alpha+1)) approximations.
And "my" approach also works for non-Hermitian matrices
As I understand the above is for Hermitians?
and regarding non-separable PDE, currently working on a paper on variable coefficient diffusion, works fine
the discussion in the screenshots above was for real symmetric matrices. Not sure if it's restricted to that by necessity or from ease of math.
I'll read that paper carefully in a week or two, thank you very much 🙂
another related field is microlocal analysis
all this goes back to Weyl, Szegö, Toeplitz
Oh Lin Lin is cool
one of my professors mentioned a method for finding roots of a quadratic involving arctangent but I couldn't find anything about it online, does anyone here know what that might be?
maybe I should have added "I think". I do not know micro local analysis but as far as I understand it revolves a lot around the asymptotic spectral properties of operators. Maybe I have misunderstood. 🙂 I tried reading into it a few years ago but found the notation quite hard to read, and then I had too much else to do 🙂
so many things to learn, so little time...
Here is a KPM implementation in Julia https://github.com/pablosanjose/Quantica.jl
I talked to the Julia package developer and he said "Well, KPM is not so good to get you precise eigenvalues, its mostly geared towards densities. If eigenvalues are isolated, you will get peaks in the eigenvalue density telling you where they are" So one can not retrieve the eigenvalues with more than O(h) from this as I understand. I have one example from him, but his ordering of nodes is not lexicographical/natural so I have to find a permutation to define the symbol. Again, thanks for making me aware of this. If I can use the expansion of these matrices I can probably make some physics people interested 🙂
I'm studying some numerical analysis approach to ADR equations. My professor commented this images saying that there is an "overshooting" in the 2nd/3rd images on the right column, but I can't seem to find a proper definition/interpretation of overshooting. Any idea?
What are these plots of?
Numerical solutions of this using DG (Left) and SUPG (Right)
gibbs phenomenon
you can use subcell shock capturing, artificial diffusion and other techniques to fix that
If you use DG you can have a shock detector and then just refine grid around the discontinuity and lower the order to 0 on the shock
that was very helpful, ty
See page 11 for an example of exactly that https://www.global-sci.org/v1/cicp/issue/FULLPDF/5/456/paper.pdf
i was thinking about what you said and i might still be able to apply that to my answer now that i've attempted part a, how could i prove what you said?
ok so if F(u) = f(u1)+...+f(un) is convex then do you see why F(b,0,...,0) = F(0,...,0,b,0,...,0) = F(0,...,0,b) = f(b) is the maximum?
you are maximizing $\sum_i f_i(u_i)$ right? So here I'm just writing F as this sum
Anticipation
yea sorry i guess i wasn't clear before
all good, ty for helping me understand
actually sorry, i thought i understood but idk if i'm right, could you explain this part again?
hm I actually haven't reigorously tried the R^n case. For R^2 its like this
$F(\alpha (0,b)+ (1-\alpha) (b,0)) \leq \alpha F(0,b)+(1-\alpha)F(b,0) = f(b)$ for all $\alpha \in [0,1]$
Anticipation
and note that ${\alpha(0,b)+(1-\alpha)(b,0) \colon \alpha \in [0,1]}$ is the feasible domain
Anticipation
For R^N, show any x in the feasible domain is in convex hull defined by the set {(b,0,...,0),...,(0,...,0,b)}
and try using the convex property for finite amount of times to reach the result
man idk how you know this stuff, this stuff is all so abstract to me
you get the R^2 case right?
i think im just using big words... its actually the same process in R^N
idk if i get it hahaha
most of the maths i've been doing for the last year has been stats and i don't even need to understand it half of the time because i'm following strict processes to get results
ah its ok, the idea is you want to show F(b) is the maximum right
yeah
feasible domain just means like the set of possible u = (u1,...,un), given the constraints
ok that makes sense
ok so then you want to show F(u) ≤ F(b) for all u in the feasible domain
and the constaints are the definition of the convex function right?
i mean the function is defined on the domain
right
So then to show this, you choose any u in the feasible domain, i.e. lets consider R^2 first
then u = (u1,u2) where u1+u2 = b and u1,u2 ≥ 0
ok i'm with you so far
hm lemme think of how to write it lol
sure
$\begin{pmatrix} u1 \ u2 \end{pmatrix} = \alpha \begin{pmatrix} b\ 0 \end{pmatrix} + (1-\alpha) \begin{pmatrix} 0\ b \end{pmatrix}$
Anticipation
you can solve for alpha here
since once you solve u1 = alpha*b, then alpha = u1/b also works for the 2nd line since u2 = b-u1
and then apply F on the RHS and use the convex property directly
to show F(u) ≤ F(b)
what do you mean apply F?
sorry for not understanding this btw ik this must be frustrating haha
I mean $F(u) = F\left(\alpha \begin{pmatrix} b\ 0 \end{pmatrix} + (1-\alpha) \begin{pmatrix} 0\ b \end{pmatrix}\right) \leq \alpha F(b,0)+(1-\alpha)F(0,b) = f(b)$
Anticipation
oh wait when you write it that way i'm with you
sorry i think what throws me off is the notation half of the time rather than the actual concepts themselves
yea notations r weird haha, I use my notation cuz i feel its more compact
anyways for R^n i'll just say this
yeah i'd write that way too now i understand it
so can i say that u_1/b * F(b,0) = (u_1,0) or is that wrong?
No I think you are done once you show F(u) ≤ f(b)
oh ok
because you know F(0,b) = F(b,0) = f(b) and what you wanted to show was f(b) is max
ok ty so much i should be able to write things out from here and apply it to what i got in part a
yea ill be busy now so i cant help until tomorrow
np i appreciate you helping me out, have a good day my friend
hey, I need to prove this. I already figured out i => ii. In the solution for ii => i they start with Lipschitz continuity but don't use the absolute values. Why can they do this?
looks like they forgot it
yeah they may have forgotten, although what they've done there is valid since f(z) - f(x) <= |f(z) - f(x)| <= etc
we don't have the rest of the proof though so it's hard to see if it's a typo or actually used like that
This font makes me cringe e
I'm glad that they publish solutions, so I don't complain about fonts ¯_(ツ)_/¯
This might be a stupid question, but why do we use gradient descent to approach the minima instead setting the gradient equal to zero and solving that equation iteratively?
they're the same thing ma boi
that's exactly what gradient descent does
i think the formulation that makes this the most clear is to take the original cost function and do a first order taylor approx. then take the proximal of that
you can easily see that you're looking for a quantity that sets the gradient to 0, and doing so iteratively by linearizing the original cost function and taking small steps at a time
you can also do something similar to the grad eq you mentioned, too, which is the same procedure but using a 2nd order approx. check out newton's method
Is this the right channel for help with network theory related content?
Hey, I need to prove this. They do this by computing the derivative of $h$ and evaluating it at $\alpha=0$ and then conclude that since $h'(0) < 0$ there is always a sufficiently small step. I don't understand that argument, why do they evaluate at $h'(0)$?
lyinch
Interesting paper that I came across today
(Approximately) multiplying matrices without multiplication
interesting, but i'd wait til it gets peer-reviewed and published
Hi, can anyone tell me how I can do this in Matlab
b(1, 1) = [0, 0, 1];
So basically, I have an nxm matrix, and elements of that matrix should be 3-dimensional vectors
In Julia
julia> A=[1,2,3]
3-element Vector{Int64}:
1
2
3
julia> B=[[A] [A]; [A] [A]]
2×2 Matrix{Vector{Int64}}:
[1, 2, 3] [1, 2, 3]
[1, 2, 3] [1, 2, 3]
julia> B[1,1]
3-element Vector{Int64}:
1
2
3
In matlab I think you need a 3D or 4D array for it like here https://se.mathworks.com/help/matlab/math/multidimensional-arrays.html Maybe use a cell array?
you can generate an n-way array using zeros(), and then assign slices
matlab doesn't deal with multiway arrays natively so nicely
but yes, wrong channel
thanks
truncation error comes from Taylor's theorem (to second order) - try finding an expression for f(x+h) in terms of f(x),f'(x),f'' if you want to see where it comes from
and the h they used to minimise error comes from minimising the maximum of the two errors above, by setting Mh/2 = 2ε/h and solving for h
Is this also true if you replace 1/n by 1/n+1 since that is the error term for ln(2)?
I'm assuming you mean replace 1/k by 1/(k+1)
In which case you would replace ln2 with 1-ln(2)
Because you switch all the signs and lose your first term
$\mathcal{O}$ is asymptotic notation so $1/n$ or $1/(n+1)$ doesn't matter
Sven-Erik
I think so, but not 100% clear 🙂
Yep, sorry about that
Hi folks. Would a question about SEM belong in here or elsewhere?
what is sem
seems like a good match as long as the problem can be formulated adequately
idk if ppl here have first-hand experience with that
for the most part it looks like parameter estimation
so to some extent it might also fit in advanced prob
Thank you, glad to know I'm in the right (or one of the right) channels.
Hey, is the control net for Bezier surface obtained by connecting points that are adjacent in the control matrix?
I guess it is
Would someone double check this for me? It's for a personal project. It's a discrete Morlet/Gabor wavelet transform, and I'm not sure I'm getting the right frequency response. I'll reply with a few plots. The signal is 0.5 * (sin(2pi441x)+sin(2pi1190.7x)). Thanks
Absolute value of the wavelet transform
Real part of the wavelet transform
Imaginary part of the wavelet transform
Absolute value of the Fourier transform
<@&286206848099549185>
The issue is that I checked what the ideal response should look like, both, numerically in Desmos, and using https://www.integral-calculator.com/, and they both suggest that I shouldn't get an oscillating frequency response (the 2nd and 3rd images show oscillation).
ofc the ideal response would nornally be represented with 2 dirac deltas, no ripples
but you have several problems
the samples are discrete, and you probably chose an arbitrary chunk of samples
your freq response is then resolution limited bevause it's like multiplying the time domain by a few heavyside step functions
this is the same as convokution in freq domain by sinc functions, which is exactly what you see in the fourier spectrum
this is because the dft is a transform for periodic functions. you can't just truncate the signal willy nilly
as for the gabor one, you decompose the spectrum into gabor functions
ofc you get gabor functions :p you told it to use exclusively that
I understand I won't get dirac delta response, but I think I'm supposed to get something close to a Gaussian curve (like the first image except without having to take the absolute value). I also, use constant resolution (the same integer number of periods for all frequencies).
this is exactky thw problem
Are you saying that whatever function f(x) I use, the output will be that f(x)?
can you post the plot?
and yes
you are taking projections of the signal onto your f(x)
what you get are scaled copies of f(x)
that's what these transforms do
show the time dom plot
The real part (red) is 0, and the imaginary part (blue) isn't oscillating.
i'm not sure that is giving you the right result
that curve is a function of n, f, and x
Hmmm okay
I did. f=10 and n=10 in that example.
Okay. My computer's slow, so it may take a few minutes lol.
I just got over the range 390<x<410. I must be doing something wrong in Desmos lol.
how about just 100
i'm also still waiting for the time domain signal lol
what did you mean by having the same number of cycles for both?
For each frequency, I would set the resolution such that the kernel would span a certain number of periods. So if I checked 1 second for 20Hz, then I would check 0.001 seconds for 20000Hz.
this is exactly the same as applying a shorter window function to each tone
it is also the reason the tone on the right looks wider
it makes the resolution get worse the higher the frequency
unless you transform the frequency axis into log domain or something with a semilog plot
This is the best I could do (f=30). Note how it just cuts off after a while.
still doesn't make sense. the product of sines will give you 2 cosines of different frequencies
I'm thinking maybe desmos is just wrong? We're both looking at the same thing, and it doesn't make sense to either of us.
i'm pretty sure that is the case
Okay.
I guess that answers my question, then. Thanks for your help.
Yeah #computing-software
🙂

Best ask in the ML server linked in #old-network
Hey I saw this in a matrix computations book and I was wondering if any of you could show me why this is true?
Multiplying on the right by A+E
Hi guys, I'm working on optimizing the prolongation part of a geometric multigrid scheme, and in doing so I'm computing the error propagating matrix from one multigrid iteration to an other (If we call it $M$, then $M=S\cdot C\cdot S$ )
And when computing this $M$ matrix I'm confused as to what to use for it: Since I use Gauss-Seidel to relax before and after doing the coarse grid correction, I don't know whether to use $S = I - (D - E)^{-1} A$ or $S= (D-E)^{-1}F$ (Where A my discretisation matrix is decomposed as: $A=D-E-F$ with $E$ and $F$ being the lower and upper parts strictly)
Anarchy
A bit delayed, but if you can't figure which S to use with math, you can always test them with numerical experiments
yeah actually I figured out that both have the same effect on the iteration error for a miltigrid scheme
What is the x axis
Number of grid points?
Hmmmm
What if there are floating point issues at play
You can’t expect anything better than 10^-15 right
And then introducing more points just means summing more floating point errors
Because L1 has slope 1 roughly and L2 has slope 1/2 it looks like
Has anyone ever used local fourier analysis as described in this paper? https://arxiv.org/pdf/1803.08864.pdf
section 3 more precisely
I need help understanding how you can go from a stencil based discretisation to a Fourier symbol on the frequencies described
What do you call the error? $| \bar{u} - u^{[k]}|_{i}$ for the $i^{th}$ norm?
Where $u$ is the solution?
Anarchy
what kind of type are you using for your data?
float32? float64?
In that case I don't think it's a numerical error
because numerical errors with float64 numbers occur when you add up numbers of orders of magnitude bigger than an order of approximately 10^5
That's not true
give me a code which generates the matrices and I can give you the symbol
In this case, you aren't getting floating point errors from adding numbers of very different magnitudes
(I really dislike the LFA approach to find it)
You're getting floating point errors because you're adding up a lot of small errors
So like the L^1 line is approximately (number of grid points)*10^-15
Because you can't get better than 10^-15 error in each individual component
And the L^2 line is approximately (number of grid points)*(10^-15)^2 because you have 10^-15 error at each grid point and then you square that
Or 10^-16 or whatever
To generate the local Fourier transform you do the following on the stencil:
For frequencies $t_x$ on $x$ and $t_y$ on $y$, and the stencil $A_s$:
import numpy as np
grid_size = 8
tx = ty = np.pi/grid_size
As = np.zeros((grid_size, grid_size, 3, 3), dtype=np.complex128)
As[..., 1, 1] = 4
As[..., 0,1] = As[..., 1,0] = As[..., 2,1] = As[..., 1,2] = -1
X, Y = np.meshgrid(np.arange(-1, 2), np.arange(-1, 2))
fourier_component = np.exp(-ci * (tx * X + ty * Y))
fourier_component = np.reshape(fourier_component, (1, 3, 3))
As_fourier = As[..., :, :] * fourier_component
That's an example with the Poisson problem stencil and $t_x = t_y = \frac{\pi}{n}$ where $n$ is the number of points
Anarchy
Sorry, but LFA is stupid 😛 use GLT. This stencil approach is just insanely complicated compered to just use Kronecker products of one dimensional symbols
I agree but it's part of the paper I'm studying 😦
I worked with LFA people for 8 months, their derivations took several black boards, mine was one line
and I actually have to justify how it works and show why it gives me a block circulant matrix at the end
oh wow
I could seriously not understand why they were doing what they were doing, so I didnt even attempt to understand in the end
I mean... The whole paper is centered around computing the Fröbenius norm of the local Fourier transform of the error propagating matrix instead of the spectral radius since it's computationally less expensive
you can do the same with GLT and it is not limited to circulants/periodic BC
the periodic BC is imposed to simplify computations apparently
and you typically derive the same symbol (in non-Hermitian case you get the wrong symbol in LFA)
how so?
I do not know, since I do not know LFA, but just by looking at a matrix I could see the spectrum they claimed the matrix had was not possible. And I could show that numerically very easily.
(and my approach with GLT actually got an explicit expression for all eigenvalues)
(this is not published yet, I hope to write it up this winter)
alright, and you can have a meaningful boundary of the spectral radius with that?
where do I sign up? XD
I do not know what you are doing, maybe some thesis project, then keep doing LFA, but just know there are much easier tools out there
Thing is my supervisors never used LFA and are asking me to read papers to justify using it
sure I was working on Parallel-in-Time matrices (in Wuppertal) and we get very nice bounds on that
and so far I can't, for the life of me, understand anything
maybe something in this one? https://arxiv.org/pdf/1706.06844.pdf
thanks
See that matrix ? I have to minimize its spectral radius, if you were asking what I specifically have to do
thanks again, really helpful
Yeah it was something like that I did (haven't worked on it for over a year so do not remember any details)
(what I had to do, was to do a similarity transformation to instead of many small blocks, to a 2x2 block matrix, and analyze each subblock, so I found symbols for each subblock)
no idea if that is a viable approach for you
maybe you should listen to your advisor instead 😛 probably there is some LFA methodology for this
I sat pretty much constantly for 4 months to find the full decomposition of the matrix I worked on (other difficulties since it was a space time matrix so 2 different parameters)
but once we had the techniques it is pretty strainght forward in the future for similar matrices
Right, I see
Thing is none of my supervisors have ever used LFA
and they are asking me to explain it to them
which I'm incapable of
Isnt there a book on LFA I think I saw that somewhere w8
and this isn't a thesis project it's my end of studies internship, but I've applied to PhDs, only in France for the moment
if you got any interesting thesis subjects don't hesistate btw
I am fully booked until fall 2022 :/
read that one, not really
alright, thanks anyways:)
but if you have actual code to generate the matrix you are interested in maybe I can have a quick look. If it is not spacetime then it is easier
becuse the code you had before was some kind thing to get the symbol right?
it's only space, Darcy problem
yeah exactly
1d or 2d?
2d
Anyone acquainted with automatic adjoint differentiation?
acquainted, yes. haven't done it myself tho.
Do you have some references for it?
this is on general automatic differentiation https://www.jmlr.org/papers/volume18/17-468/17-468.pdf
How would I go about generating a sample path of a geometric brownian motion with predetermined start and end prices?
you migth be interested in what we call brownian bridges
Hmm interesting, I might try that 👍
I don’t immediately see how this helps 🤔
Actually looked into it and that worked for me thanks
that's the algorithm I used if anyone is interested:
def generateBoundedBrownianBridge(T, sigma, start, end, dt):
'''
Generate a bounded geometric brownian motion making use of a brownian bridge.
Params:
T: time horizon
sigma: volatility
start: start price
end: end price
dt: time steps size
Returns:
t: time array
S: time series
'''
S0 = start
mu = (1/T)*np.log(end/start) + (sigma**2)/2
N = round(T/dt)
t = np.linspace(0, T, N)
W = np.random.standard_normal(size = N)
W = np.cumsum(W)*np.sqrt(dt)
W = W - (t/T)*W[-1]
X = (mu-0.5*sigma**2)*t + sigma*W
S = S0*np.exp(X)
return t, S
from a code quality perspective, you could split it into a wiener process method, and then only create the bridge whose impl is now trivial ;)
when ur implementing gauss jordan elimination algorithim in a machine
u multiply row by scalar to get pivot points and subtract it by other rows
this leads to cancellation and some rounding error, is there a way to minimize this cancellation
error or prevent catastrophic cancellation?
you can try to precondition the system
also certain row orders behave nicer than others
so do u partial pivot first?
to minimize error?
iirc partial pivoting the row w the greatest infinite norm to the very top of the matrix
and repeating hte process every time
u reduce for a pivot
minimizes error
i'd be lying if i said i know the specifics, it's been a long time
num anal is so weird, the basic ideas on how u solve a system is p simple
but it feels so weird cuz u have to consider rounding error and all sorts of diff types of error when u implement algos to solve a problem which u dont really do too much inother math clases
You should read chapter 2 of Demmel’s Numerical Linear Algebra
It is a good book but might not overlap exactly in content
Gaussian elimination over Z2 can be sped up by “word size” times with bitsets
.
Word size is typically 32 or 64
I need to check if $$ 2^{-1}+2^{-29}$$ can be represented in single precision floating point , i got told i needed to change it to base 2 binary so i could see if it is possible , this results in $$0.10000000000000000000000000001$$ now what part of this base 2 is the exponent and mantissa?
Zeffs
How are floating point numbers represented?
Like if you have a mantissa and exponent what number do those correspond to?
well 1 bit should be sign, next 8 bits should be exponent then the last 23 bits should be mantissa
Oh are you using single precision
ye
Ok so those are how many bits are allocated to each
But how do you recover a number from that
and i need to check if $$ 2^{-1}+2^{-29}$$ can be represented in this machine number
Zeffs
Yes I know this
Like those are what the 32 bits are
But if I give you an exponent and a mantissa, what number do those correspond to?
Like if the exponent is 000010000
And the mantissa is 01010101010101010101010101010 or however long 23 bits is
cant i just check if the mantissa is longer than 23 bits
Ok is the mantissa for 2^-1+2^-29 larger than 23 bits
thats what im unsure of, it should be readable from the base 2 conversion of that number
but idk how to read it
another person said it was like this: exponent = -1, mantissa = (27 zeros)1) which matches with the base 2 conversion, but im not sure why the exponent is -1
What is the mantissa of the number 2
number 2?
1?
Unlikely
wait no its 00000000000000000000000
What
yes the mantissa from base 10 decimal to single precision
Hmmmm
Ok how about this
The mantissa is 23 bits right
So it covers a range of 23 powers of 2
2^-1 and 2^-29 are more than 23 powers of 2 apart
yes so it cant be represented
Yeah
but im unsure of how its read from the base 2 conversion of 2^-1 and 2^-29
$$0.10000000000000000000000000001$$ is this full thing the mantissa?
Zeffs
I guess to read it from the base 2 representation you just see that the base 2 representation is more than 23 digits long
yeah but thats what dont make sense it my mind , cause 23 bits is only for mantissa but the base 2 conversion should also include the exponent right
Sure but the exponent only changes the position of the decimal point in the representation
It doesn't change the length
oh
The exponent is just multiplying by a power of 2 right
Which is equivalent to moving the decimal point around
In a base 2 number
so it makes sense to say that the base 2 representation is equal to this : (exponent = -1, mantissa = (27 zeros)1)?
Wouldn't it be 1(27 zeroes)1?
i guess, maybe it was a typo on my friends part
hmm u said that the decimal gets moved around but it doesnt
Anyone can help me with convergence of gradient descent algorithm?

Presumably you have looked at the convergence analysis?
ya
was following the book "An introduction to optimization" by Chong
where is this V(x) coming from?
You are familiar with how a matrix Q induces an inner product right
hmm
If it satisfies certain conditions
Q > 0
Positive definiteness
And symmetric
Then you have $\norm{x}_Q^2=x^TQx$
And you check that this this is a norm
ya
$V(x)$ is just the norm of $x-x^*$ with respect to the norm induced by $Q$
Angetenar
Well with a 1/2
Angetenar
And then you have a 1/2 as a scaling
Because when you differentiate 1/2x^2 this becomes x
So it's just avoiding coefficients later on
but how did we go from $f(x)+1/2 x* ^t Q x $ to the term on the right
What is f(x)
objective function
starts with arbitrary function and says it'll be easier if we analyze the RHS function instead of that one
that's where I'm confused
Do you have an explicit form for f
no
been doing that
Ok great
so I shouldn't try find any correspondence ?
like just ignore as if it was never there


Sure
hey guys i posted a channel in a different channel but it could apply here too
whats the relationship between biconnected components in a graph and the number of vertices?
i know its a factor of n because articulation points overlap but im not sure beyond that
i have somewhat of an idea
I know algorithms that count/find lattice points in polytopes exist, but does anyone know of any such implemented algorithms?


