So I’ve seen a bunch of names in here like spectral methods, finite element methods, finite difference methods (and have no clue how exactly they work except from a short Wikipedia explanation)…
Are they worth looking into to achieve higher computational performance for similar accuracy on large systems of ordinary differential equations (that can be expressed very well as a single equation with lin alg)? Or does that completely depend on what the DE looks like
#numerical-analysis
1 messages · Page 8 of 1
depends on the ODE
Spectral methods is very fast
and exponentially convergent
but doesnt always work
Finite difference method is based on usual approximations of derivates (like (f(x+h)-f(x))/h)
Finite element method can be used with different meshes and situations where finite difference wouldn't be handy
Basically if your domain is not a simple geometry where doing finite differences is possible and easy to do, finite elements are easier to use
If your domain is a square, it doesn't matter too much, but if you have to do fluid simulation with a car 3d model, it does matter
Daaaaaaamn
If I want to learn about them, should I just download some book about numerical analysis? P.S., what're the prerequisites that I should know beforehand? Real analysis? Some other stuff?
Linear algebra, knowing what a basis is, specifically a basis function
Chebyshev polynomials
Gauss Jordan and other linear equation solving algorithms (perhaps LU Decomposition)
Essentially the idea of Spectral methods is you assume an approximate solution: [y_N=\sum_{k=0}^Na_k\phi_k(x),] where $\phi_k(x)$ are basis functions that you choose, e.g. the Chebyshev polynomials and $a_k$ are unknowns that you are trying to find. Then to find $a_k$ the most popular method is collocation. So you force $y_N$ to satisfy the boundary conditons, such as $y(0)=1$, which would give you: [1=\sum_{k=0}^Na_k\phi_k(0)] which although it doesnt look like it, is an equation in the $a_k$. Then you force $y_N$ to satisfy the ODE at particular points (which again you choose)[D(y_N(x_i))=0], where $Dy(x)=0$ is the differential equation and $x_i$ is the collocation point you chose.
Max
Interesting
Although it sounds kind of unintuitive that that would necessarily converge to the correct solution at all
Have you done function approximation theory?@wicked pecan
Like using a Taylor series or Fourier series, then later Chebyshev series
Assuming enough smoothness, it will converge to the strong solution.
By approximation theory.
Essentially you are pretending you have found an approximation of the exact solution in some basis. Then you force that approximate solution to satisfy the ODE as best as possible
If you can get something like a Galerkin orthogonality, it is even easier to prove convergence.
We can extend the argument to nonlinear problems using fixed point theorems.
A little bit… I know Taylor Series, I know what Fourier series does, haven’t heard of chebyshev series (I assume it’s like Fourier series but it adds up chebyshev polynomials to get the function/signal?)
Nothing rigorous though in that area
The “pretend” part there kinda is kinda throwing me off
The basis are polynomials. We can approximate any smooth function using polynomials.
And very well for that fact
For certain functions to produce an approximation where you can see basically no error you only need like 4 or 5 basis polynomials
The convergence is much faster than a Taylor series
Are spectral methods like the svd or fourier coefficients? For example with the svd, the larger singular values are associated with lower frequency and don't cross the x axis as much as the smaller values that have a higher frequency and do cross a lot?
Has anyone here done any tikhonav regression with bayesian prior knowledge in the aprori part?
For example |Ax-b|^2_2 + lambda^2|L(x-x*)|^2_2?
I'm wondering if anyone would know a good book that shows this with analysis and applications where you use the normal distribution of A in the aprori L and x*
Surround the equation with $ $
Max
I haven't done SVD. They are similar to Fourier coefficients (well Fourier truncation)
Fourier coefficients and truncating the fourier series is a type of spectral methods where you use a trigonometric basis.
So it's essentially any type of basis that is like a sinosodual basis
Any basis: polynomial basis, rational basis, fractional basis, fractional rational basis, cardinal basis ext.
Yes, but does the behavior model a trigonemetric one
Wdym?
The lower elements of the fourier coefficients cross the x axis less as in their lower frequency elements.
While the higher frequency elements of the fourier coefficients cross more
The svd basis is called a spectral basis in my book I'm reading Because it models this behavior
This is a good way to see it
Yeah I understand how that would work but what does this have to do with ODEs?
I was asking if that's what a spectral method is. A basis that models this behavior
Sorry
You could use a basis expansion of a function to approximate those functions, which is essentially what spectral methods do for the exact solution of an ODE/PDE
However for the examples you showed above there might be problems because of how much the curves fluctuate
That's why you find a good cut off to minimize the perturbation and regularization error 🙂
Yeah fair enough, as they are periodic you'd probably just use a trig basis
Actually the svd term U^tb is called the fourier coefficient in this book
I believe theres a relationship too the generalized form
thanks max 🙂
👍
Not sure what that is but I'm sure once I get around to looking at this area it'll make sense
This is not true For example with the svd, the larger singular values are associated with lower frequency
The svd is the discrete form of the sve. It approximates the sve values. So it should have it's properties
I'm a hobbist, I could be wrong. 
If you could look at this and see that the image shows what I Said too
Anyways if you think im still wrong after this I'll believe you
If you by "high frequency" mean oscillating elements in the eigen/singular vectors just take the Laplacian or biLaplacian and you have low frequencies associated with the small eignvalues (and singular values) (and if you just take the absolute values of the elements it is the reverse)
just do the decomposition of of toeplitz(n,[2,-1],[2,-1]) and toeplitz(n,[2,1],[2,1]) and compare
Is the laplacian square integrable?
Because this example shows that it doesn't work with the laplacian unless you have the bottom condition.
Okay i think I need to say, what I Said applies only if the picard conditions hold true
where's the laplacian
I totally misread sorry
You're right, I looked deeper at this. I'm talking about a specific case where the discreetization of a Fredholm integral of the first kind has these properties.
I'm sorry 🙂
no need to be sorry, point to be here is to learn new things 🙂
Hi, I have this system of ODEs and an IVP to solve, but I'm having trouble starting and could use a hint
I have a working solver (using RK4), but i'm struggling to actually form the differential equations
I'm given GM, and know that these are ODEs describing orbital dynamics about Earth, so I assume r(t) = (x(t), y(t)) = (Rcost, Rsint), then r'(t) = v(t)
but where do I go from there?
These are the initial conditions
it's already in the form $\frac{\dd \vec{u}}{\dd t} = \vec{f}(u,t), \vec{u}(t_0) = \vec{u}_0$ with $\vec{u}(t) = (x(t), y(t), v_x(t), v_y(t))$, just plug this into runge kutta scheme
Transparent Elemental
ah right, I think I got it now
maybe not... My results seem wrong. Just to clarify, the array of equations would then look like this?
rhs[0] = u[2];
rhs[1] = u[3];
rhs[2] = -(GM*u[0])/(u[0]^2 + u[1]^2)^3/2;
rhs[3] = -(GM*u[1])/ ... ;
are you sure z^3/2 computes as z^(1.5) and not as z^3 * 1/2 or z^1?
I just wrote it like that to shorten it.
I really have pow(u[0] * u[0] + u[1] * u[1], 1.5)
so should compute fine
but that is the general structure right?
yes
I'm trying to add effect of a static moon to the simulation and now have the following
[ \begin{pmatrix} \vdots \ v'_x(t) \ v'_y(t) \end{pmatrix} = \begin{pmatrix} \vdots \ \cdots - \frac{GM_L}{\sqrt{ ((x(t)-x_L)^2 + (y(t)-y_L)^2)^3}} x(t) \ \cdots - \frac{GM_L}{\sqrt{ ((x(t)-x_L)^2 + (y(t)-y_L)^2)^3}} y(t) \end{pmatrix} ]
But the trajectory is barely changing. Am I going about this wrong?
sub L are values for the moon
and the dots are the terms from before
shsgd
\textbf{Task 2 (Householder Transformation)}
Determine a Householder vector through which the Householder transformation $Q$ is defined, which maps
[
a^{\top}=
\begin{pmatrix}
-1\
5\
3\
-3\
-2\
-1
\end{pmatrix}^{\top}
]
to a multiple of $e_1$. Use the cancellation-free variant, even though it may not occur here.
CiaoWelt
Can anyone help me with this task...
By any chance, would anyone be able to help me understand backward's eulers method
what about it
When you use forward euler's to get your values, how do you plug those values into the backwards euler equation?
what values and how is this related to backwards' euler?
are you asking how to run a backwards euler?
for fixed y_n and x_{n+1} this is a nonlinear equation for y_{n+1}, you solve it using algorithms for solving nonlinear equations
so i plug in the function and values to solve for yn+1?
do you know any algorithms to solve nonlinear equations?
not really
well then you can't run it
what kind of algorithms?
bisection, newton's method, etc
ohh, i was wondering if i can use the euler's method to solve the implicit method
the purpose of euler's method is to solve differential equations
not finding roots of functions
no, the lower index refers in y_n refers to value of y at x_n
the upper index would refer to approximate value of y_n as you go through fixed point iterations, which are indexed by upper index
y_n^k is k-th approximation of y at x_n
you fix n = 0 and then start at k = 0 running the iterations y_1^{k+1) = y_0 + hf(x_1,y_1^k) until some capital K, after which you stop and declare y_1 = y_0^K + h*f(x_1,y_1^K)
increase n by 1 and reset k to 0 again and repeat
for K and K+1, i think those are to denote the differences between forward and backwards method right?
no
forward euler is completely trivial, there's no inner loop in terms of k in the first place
this is all related to backwards euler and not forward euler?
unless the bottom picture talks about forward euler and just uses upper index as iteration counter
yes
well why should using different notation change anything, what I said applies in this case
there is no dependency on next value of y in function f in forward euler unlike backward euler, which is why it says
does backward's euler use the value of y from the forward euler to solve the diff eq?
no
so here do i jus subtract 150 by 0.001?
0.150000 x 10^3 - 0.000001 x 10^3
or jus 150 - 150*0.001
Not really a numerical analysis question.
But the way I've been taught to do this is consider the partial sums.
[\left\lvert\frac{p-p^*}{p}\right\rvert<0.001]
Max
The main way I know is to compare to sum of 1/n(n-1) [over n > 1 ofc] which is easy to compute
Yeah that's nice
but we dont know what p* is
do i do for + aswell?
so p + p*
and p - p*?
Nope, $\lvert p-p^\rvert$ is the difference in $p$ and $p^$
Max
In the given quadratcic equation b^2 is much larger than 4ac. You can see this by performing the calculation
So doing 21 a) if both formulas are correct and do the same thing shouldn't they just cancel each other out?
Assuming BOTH equations are correct
it's +x_0 y_0 on the right side instead of negative
Yea I fixed that they all canceled out
Is that good enough for a proof that they are equal?
Because this solution was 10x confusing
well if you're bothered by the reasoning "assuming they're both correct", you can just transform the second form into first form
How do
So*
$x_0 - \frac{(x_1-x_0)y_0}{y_1 - y_0} = \frac{x_0 (y_1 - y_0)}{y_1 - y_0} - \frac{(x_1-x_0)y_0}{y_1-y_0}$
Transparent Elemental
then just expand and subtract
seems like the answer goes into full derivation of the formula
it's just ambiguous in what was really meant by "show they're algebraically correct"
what is numerical stability
would my method still be correct?
like the way i did it and everything canceld out
yes
I am not sure what the best channel for this question is btu I am using it for num-analysis so hopefully it fits here.
If I have a vector $v$ in $\mathbb{R}^2$, the outer tensor product is the matrix made by exhausting all product combinations, right? That is $v_x \cdot v_x, v_x \cdot v_y, v_y \cdot v_x, v_y \cdot v_y$.
What would you call the operator that then maps that matrix into $\mathbb{R}$ via the natural operation $$v_x \cdot v_x + v_x \cdot v_y + v_y \cdot v_x + v_y \cdot v_y$$ ?
Makogan
the operator that multiplies from the left by 1^T and from the right by 1?
where 1 is a column of ones
but is there no name for it? Like "outer tensor product"? i.e. for the operation that takes $v$ and maps it to the sum of the components of the outer product.
Makogan
it's not a linear operator
$T(x) = \bold{1}^\top x x^\top \bold{1} = (\bold{1}^\top x)^2$
Transparent Elemental
I guess if consider T(A) = C^T AC for orthogonal C then it's a conjugation in a sense, but that's about it
So... why exactly is the fourier basis the set of eigenvectors of the laplace operator? :p
I see, thank you lots.
It might help to start by thinking in 1D. The laplace operator in 1D is just the second order derivative d^2/dx^2.
But in that setting the eigenfunctions will be exponentials. I think you only start getting the Fourier series in the complex plane
Via Euler's formula
Have you considered the boundary conditions?
For instance, consider the homogeneous Dirichlet boundary conditions.
Oh I see, thank you
Does anyone an approximate range for how much the solution of an IVP may be sped up by using Parareal with a lot of processors?
I know it'll vary a lot depending on the exact DE and processing power, which is why I'm asking for an approximate range
Compared to a normal, say, 8th order RK
Hi, given two matrixes A and B, I want to compute A^-1 * B in a numerically stable and efficient way. With LAPACK (e.g. the cuSOLVER or MKL libraries), would it be better to use getrf + getrs, gesv, or something else?
how do i ge tthe upper bound here
for the error im so confused
i dont udnerstand how they are finding a bound on the error
looking at classes i have to take soon.. subjective question but did you guys find numerical analysis harder than cal 2/3?
Read the documentation to interpret the subroutine or function names. d double precision. ge General matrix. trf triangular factorization. trs is possibly solve using a given LU factorization. gesv should be solving a general matrix Ax=b for A. Check docs for exactly what things do.
you want answer or way to solve?
I know the value of A inverse i just want to know compute A^(-3)
Look up how to invert a 2 by 2 matrix, it's just a formula
this channel is for numerical analysis
use one of those
but for a 2 by 2 matrix, the formula for inversion can be found by a quick google search
hey you want help you must use help command, now coming back to your question, to find inv(A), find det(A) and Adjoint(A) and then inv(A)=adj(A)/det(A)
again use caley hemilton to find A^(-3)
Got it
Thanku
never mind
i guessed you solved it already
I want to approximate erf(1-z/(sqrt(z)) - erf(1+z/sqrt(z)) is there any ideas?
arent there exact asymptotics
how can i do q10
i assume i have to use the area approx by trapezoid as f, the treat the f'(x) as f'' in the centre formula
What are the issues with exact real arithmetic involving streams/buffers/T2E?
You should be able to accelerate it using your CPU's or GPU's Floating-Point Unit.
It seems like it could be useful if you don't know how to estimate the numerical stability of an algorithm. Then just use exact real arithmetic and forget all about it.
I think half of any numerical analysis textbook is filled with random inequalities, which are supposed to help you estimate forward/backward error of an algorithm. When you estimate the F/B error of an algorithm wrong, floating point gives you no reliable indicators that you made a mistake. It seems not very scalable.
Anybody have any pointers?
It seems like you can at least use exact real arithmetic as a fallback when you don't know the numerical stability of an algorithm. You pay a price speedwise compared to using pure floats, but you gain a very easy correctness guarantee instead of shrugging your shoulders about it.
you waited 19 minutes. be patient: people have lives
what is exact real arithmetic involving streams/buffers/T2E?
A bunch of concurrent processes writing to buffers. A buffer contains successive approximations to a real number.
Reading and writing are blocking operations.
There's something here: https://wiki.haskell.org/Exact_real_arithmetic
dead links and a msc from 98. yeah that is just nonsense...
https://www.pure.ed.ac.uk/ws/portalfiles/portal/12417914/download_2.pdf -- This is a good intro
Everything here: https://en.wikipedia.org/wiki/Computable_analysis
In mathematics and computer science, computable analysis is the study of mathematical analysis from the perspective of computability theory. It is concerned with the parts of real analysis and functional analysis that can be carried out in a computable manner. The field is closely related to constructive analysis and numerical analysis.
A notab...
The Turing machine model in the Wiki article works roughly the way I describe
And this: https://www.paultaylor.eu/ASD/
crank index 10/10
Where do you see crank-level stuff?
The author of the ASD web page is a bit idiosyncratic, but he's well-respected by colleagues
Or are you referring to myself?
Any clarification? Thanks
the first links you provided were not convincing at all, this last one seems to be from reputable people.
have a hard time to see the problem with just using normal high precision arithmetics
it is relevant for formal verification, for example
ensuring correctness reduces the chance of coq outputting a false negative
but the last ref seems to use it for creating a minimal cad system...which seems excessive
well this doesnt seem like a big area lol
and maybe that is a function of its practitioners not focusing on well-motivated problems
you could try posting in #foundations @ruby pike as there may be people who read that channel who can speak to the implementation of these things in formal verification contexts
perhaps the implementations in those contexts do not actually run fast?
Does anyone know how you might go about implementing Monte Carlo integration on parametric functions
Trying to use it to approximate the area of this banana
They seem like a placeholder. Lower down, it seems 'k' appears where a dot might appear. Full context is unclear so this interpretation could be wrong.
All floating-point values in a finite number of bits are rational numbers so real numbers with an infinite, non-repeating decimal expansion are not really relevant.
How big are the matrices? Are they sparse or dense?
what
The matrixes are about 5,000x1,5000, and have a typical sparsity of 70%, but it depends
[Assuming FP64, 8 bytes * (5000*15000)^2 = 4.5e16 bytes = 45 petabytes so I think you are unlikely to be able to find the actual target matrix as a dense matrix and store it.] Mistake, for an unknown reason, it was assumed N was given for an NxN matrix.
Well, there was a typo, as I meant to write 1,500, not 1,5000. But regardless, you definitely don't square the # elements in the matrix to estimate memory consumption
Apologies, mistake probably due to doing this for square HPL matrices.
Is A not a square matrix?
Does anyone have a link to a proof of the uniqueness of Hermite interpolation polynomial?
I know I'm supposed to write the system of equations, and then show that its homogeneous one has more zeroes than it should
But I just can't get it
surely its trivial as they are linearly independent?
you could also try using fundamental theorem of algebra
I can't use that they're linearly independent because that's what I'm trying to prove
Yes, that's what I want to do. The homogenous system has one more root than it should have
But I don't really know how to write it properly
it's trivial no? each hermite polynomial has a different degree
What?
I have to prove the uniqueness of the polynomial
I mean you have a set of polynomials, the Hermite polynomials, each hermite polynomial is linearly independent.
As we know there is one unique interpolating polynomial
and the hermite basis is linearly independent and forms a basis of polynomials, then uniqueness follows
I'm reading Verner's thesis, and his definition of a general system of differential equations is defined as follows (plus some more stuff)
Does anyone know what the r=1(1)q on the right in the middle could mean? P.s. those might also be "l"s, 1 and l looks identical in that document :/
It’s an index
I asked chatGPT
He claims it’s supposed to signify a range from 1 to r with step size 1
On an unrelated side note, does anyone know from experience what kind of speed up one could expect for solving systems of differential equations numerically using some standard method like 8th-order-RK-Fehlberg vs the result of a deep dive into numerical analysis and using some more complicated method (assuming you’re trying to solve with the same error tolerance both times)?
I know the speedup would vary greatly depending on a large number of factors, I’m just looking for an approximate order of magnitude to decide whether it would be worth it for me to learn a bunch of numerical analysis to use those more complicated methods
i am trying to find the minimum of this function
$$f(x)=\rho(\lVert x\rVert_2,\lambda)+\frac{\theta}{2}\lVert\beta_1-\beta_2-x\rVert_2^2+\alpha^T(x-\beta_1+\beta_2)$$
where $\rho(\lVert x\rVert_2,\lambda)=\lambda\int_0^{\lVert x\rVert_2}\left(1-\frac{t}{\tau\lambda}\right)_+dt$.
My answer to this problem is
$$\theta\left(\frac{\lambda}{\lVert x\rVert_2\theta}-\frac{1+\tau\theta}{\tau\theta}}\right)x=\theta(\beta_1-\beta_2-\alpha/\theta)$$
夢叶
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
@edgy flax let's move this to #optimization since it's fairly long to explain
Posted
can someone help with what this dot means in this context
the original shwarz method has simply a y
placeholder basically
usually just means any input
like people do $(\cdot,\cdot)$ to represent inner product of two arbitrary functions/vectors
ansh
thanks so much been tweeking forever over that
i dont understand how to find the number of iterations to acheive 10^-4 using the theorem
Have you found k?
It is not difficult to find k since g'(x) is decreasing in [1/3, 1]
Then if you look at (2.5), max{p0 - a, b - p0} = max{p0 - 1/3, 1 - p0} <= 2/3
So we have the bound |p_n - p| <= (2/3)k^n
That means we need to find n such that (2/3)k^n <= 10^(-4)
yeah i did that
i did the formula 2.6
g(1/3)^n/1-g(1/3) * |p1-p0| < 10^-4
ik its g`(x) discord notation doesnt let me write it
then i solved and i got 14
k should be independent of x
which idk if it is correct
Since you got a number greater than 14
That means you need 15 iterations to guarantee the error within 10^(-4)
I used equation (2.5) and arrived at the same conclusion
for eqn 2.5 how did u do it
k^n < 10^-4?
i wasnt sure what this max part was
are u sure?
i get this
so shouldnt n be 16 if we use top equation
k = 2^(-1/3) ln 2
whats k in this case?
Since |g'(x)| = |-2^(-x) ln 2| = 2^(-x) ln 2 is decreasing in [1/3, 1], it attains its maximum at x = 1/3.
So k = |g'(1/3)| = 2^(-1/3) ln 2
(2/3)k^n <= 10^(-4) implies n >= 14.734...
Hey, can someone help me
How do I prove, that for $f(x)=\exp(-x)$ the fixed point iterations $x_{i+1}=f(x_i)$ converges to the fixed point for all $x_0\in (0.4,0.8)$?
Nande
it's derivative in absolute value is less than 1
can sm1 explain what func g(x) i use for the fixed point method
im kind've lost
i tried g(x) = 25/x^3
not sure waht else i can do can any1 help?
oops I forgot about the Banach fixed-point theorem, ty
so you want an equation that has $\sqrt[3]{25}$ as a root, for instance: $x^3-25=0$. Then just use the newton raphson method because its a type of fixed point iteration method.
Max
you could create other fixed point iterations by diving both sides by x and making one of the x's the subject: [x^2-25x^{-1}=0\Rightarrow x^2=25x^{-1}\Rightarrow x=5x^{-1/2}]
Max
essentially the problem has become finding where these two curves intersect:
@eternal bronze
Has anyone done any work approximating (the products of) large (unitary) matrices by hashing their basis vectors to a much smaller dimension and summing the elements sent to the same hash bin to get a smaller dimension matrix? Is there some other known method for dimension reducing unitary matrices?
Hey, would it be appropriate to ask about Gauss Seidel iterative method in this channel ?
in any case, I listed my question here :
#help-3 message
Can u send the problem here
Is there a reason why PCA wouldn’t work?
I guess I should have specified that I'm multiplying lots of unitary matrices together, but my concern is that numerical problems would show up at the level of one multiplication because 1) you appear to lose unitarity and 2) attempting to invert matrices leads to a huge range of determinants (so you should avoid it) and 3) I was concerned that the hash functions might fail for cherry-picked unitary matrices.
It's Principal Component Analysis.
I thought about using Factor Analysis and Gaussian Belief Propagation to solve the whole network, but for that to be tractable the dimension of the unitary matrices would have to be brought down first, hence the hash functions.
Hi!
I am interested in Finite Element Exterior Calculus (FEEC), which is a finite element theory that it formulated in terms of differential forms and exterior calculus.
I'm programming a FEEC library for solving PDEs in rust, that supports arbitrary spatial dimensions.
If you know anybody about one of these topics, then come talk to me about it :)
I guess you have seen https://github.com/eschnett/DDF.jl
can someone teach me basic maths for 10th grade im bad at maths i want help😭
Sure what do you need ?
What are the conditions of convergence for the secant method ?
this is not the place. pre-university
Like addition subtraction
Like I'm so bad
I won't forget u ever in my life if u could teach me😭
You are in the wrong place.
Basic arithmetic belongs in the #prealg-and-algebra channel.
And you'll have a lot better chance of helpful answer if you can ask a concrete question about something that confuses you, instead of just saying "teach me" in general.
Short query: All of these convergence at least quadratically, don't they?
indeed
you can always check using the definition
I think $z_k$ will be a bit more tricky algebraically but if the other two converge quadratically then c) definetly does
Max
One can just compute $\left|\frac{z_{k+1} - 1}{(z_k - 1)^2}\right|$. I don't think its algebraically complicated.
Himari
Yeah true 🤣 mb
so i was using secant method
to trace the contour
and was wondering why it diverges to -inf
quick question: if finite element methods belong to math category specially numerical analysis? Seems nobody talk about them here
From what I heard better than finite difference. But I don't see a lot of classes and math books on them
braess or brenner and scott are two good books
it seems more taught in mechanical engineering rather than standard applied math classes
that would depend on your university
personally I have never taken any applied course, only analysis courses on FEM
and if FEM is better than FDM is up to you and your definition of "better" 🙂
i have a set of notes on them if you’re interested, it’s from a mathematics course rather than engineering
Hello.
I'm trying to use the ILU decomposition as a smoother in multigrid for the anisotropic equation. In picture 1, the PDE is given as well as the finite difference operator in stencil notation. This is a 5-point stencil.
The ILU decomposition I want to use is the 7-point North-East decomposition(Ordering of the unknown grid points is column-wise bottom to top, left to right) as shown in picture 2. The stars are possible non-zero elements.
Lh = (Lh^)(Uh^) - Rh.
My question is, how can I find the coefficients in the stars?
Pictures 3-4 given for extra info.
Book: Ulrich Trottenberg, Multigrid.
hey thanks for pointing me to this again! only now i have realized that this is an actual arbitrary dimension FEEC implementation in julia! This is great to know of.
when playing around with stochastic gradient decent, i noticed that having data that is too big, our update rule will blow up
does anyone know where you can read about this?
theta_{i+1} = theta_{i} - alpha * sum_j(h(x) - y)*x_i
x ranging from -100 to 100 ^
x ranging from -10 to 10
same number of datapoints and same alpha
have you tried decreasing the step size?
and also plot what happens here before it goes to 1e307, the error graph isn't informative when it's obscured by large values
(also get used to using log scale when plotting any kind of errors)
hahah alright
yes it gave really weird plots:
well it's diverging, so try decreasing it some more
in general stochastic gradient diverges unless you tune it right
https://gits-15.sys.kth.se/jonathla/ML my code if it helps
alright
do you know where you read about this?
cant find it in my numerical method book
you just need to find analysis of SGD somewhere, it's in a lot of places
the basic result is that it doesn't converge to a minimum if you use fixed step size, only to some neighborhood of it after which it oscillates
and even regular gradient descent diverges if your step size is too large
right okay thank you
do you know any analysis how to know if you have hit a local minimum and not the global minimum ?
should you just try random starting points and hope for the best?
if you know the function is strictly convex you know you're there (in the global minimum)
otherwise you don't know anything
okok!
Heyo! I'm not from this field, so please forgive my silly question 
I need to do some plotting to check some things, and I'm stuck. I have a 3d function f(x,y,z), and I managed to plot its contour surfaces using mayavi:
import numpy as np
import mayavi.mlab as my
rang = 2
num = 100
f = np.mgrid[0:0:num*1j,0:0:num*1j,0:0:num*1j][0]
for i,x in enumerate(np.linspace(-rang,rang,num=num)):
for j,y in enumerate(np.linspace(-rang,rang,num=num)):
for k,z in enumerate(np.linspace(-rang,rang,num=num)):
f[i][j][k] = my_functionfun(x,y,z)
my.contour3d(*np.mgrid[-rang:rang:num*1j,-rang:rang:num*1j,-rang:rang:num*1j],f,contours=10,transparent=True)```
I thought this would help me all right, but it can be quite messy with the function I'm interested in. What I really want to plot is the set of critical points of my function. I can somehow guess it by looking at the contour surfaces, but it's not sufficient. Would someone know of a way?
I can approximate (the norm of) its gradient by finite differences, but then I'd need to find all the vanishing points of that new function... Trying to plot the contour surfaces of that function results in something unusable (indeed, the 0-level set is not going to be a surface, but rather a curve and/or a discrete set...). Thanks for helping a newbie!
N.B.: I'm not especially looking for actual code, just a method is fine! I just don't know what to do!
its easy to plot the gradient
Would you care to elaborate? 
you just plot the gradient and see where it's zero
Yeah; but how do you plot a 3d function except in 4-space?
it's a function that assigns every point a 3-dimensional vector
it's easy to plot
Hello everyone,
I'm currently working on a homework problem where we need to approximate the natural logarithm using an iteration method described by B.C. Carlson. The method involves iterative computation of arithmetic and geometric means, and seems quite interesting.
The iteration method is detailed in the paper:
B.C. Carlson: An Algorithm for Computing Logarithms and Arctangents, MathComp. 26 (118), 1972 pp. 543-549. DOI:10.1090/S0025-5718-1972-0307438-2.
Here's a summary of what I need to do:
- Initialize \( a_0 = \frac{(1+x)}{2} \) and \( g_0 = \sqrt{x} \).
- Iteratively compute \( a_{i+1} = \frac{a_i + g_i}{2} \) and \( g_{i+1} = \sqrt{a_{i+1} \cdot g_i} \).
- Use \( \frac{x-1}{a_i} \) as an approximation to \( \ln(x) \).```
The task requires writing a function `approx_ln(x, n)` that uses \( n \) iterations of this algorithm to approximate the natural logarithm \( \ln(x) \).
I'm a bit stuck on how to implement this in code, especially handling the iterations and ensuring the accuracy of the approximation. Could someone provide guidance or a starting point for this implementation? Any help would be greatly appreciated!
just write a for loop for n iterations
Care to elaborate?
@plucky kayak Using this?
for _ in range(n):
This is what I had in mind:
def approx_ln(x, n):
# Ensure the input x is greater than 0
if x <= 0:
raise ValueError("x must be greater than 0")
# Initialization
a = (1 + x) / 2 # a_0
g = math.sqrt(x) # g_0
# Iterate n times
for _ in range(n):
a_next = (a + g) / 2 # Compute a_{i+1}
g = math.sqrt(a_next * g) # Compute g_{i+1}
a = a_next # Update a to a_{i+1}
# Approximation of ln(x)
ln_x_approx = (x - 1) / a
return ln_x_approx
# Example usage
x = 5
n = 10
print(f"Approximation of ln({x}) with {n} iterations: {approx_ln(x, n)}")
yes and it depends on what the input x is supposed to represent
either it's the starting point or the number of which log should be approximated
This is the first task of my homework assignment, but I don't know if I interpreted it correctly in my code:
Does anyone know some good reading for minimax approximation. I have been trying to understand it using Sirca: https://link.springer.com/book/10.1007/978-3-319-78619-3 but I am unable to understand how to implement it.
I don't want to look at the code / check for you as a good numerical analyst needs to learn how to check their own code.
My suggestion is using known values such as ln2, ln3 ext. and use different values for n to see if your algorithm produces an accurate result
Perhaps a good exercise would be plotting the error as n increases
You should also be able to use this algorithm to plot ln(x) (although it may take a while)
This is what I wrote, is this what you had in mind?
import numpy as np
import matplotlib.pyplot as plt
def approx_ln(x, n):
if x <= 0:
raise ValueError("x must be greater than 0")
# Initial values
a_i = (1 + x) / 2
g_i = np.sqrt(x)
# Iterative calculation
for _ in range(n):
a_next = (a_i + g_i) / 2
g_next = np.sqrt(a_next * g_i)
a_i, g_i = a_next, g_next
# Final approximation
ln_approx = (x - 1) / a_i
return ln_approx
# Generate a range of x values
x_values = np.linspace(0.1, 5, 400)
# Compute the true ln(x) values
true_ln_values = np.log(x_values)
# Different values of n for approximation
n_values = [1, 2, 5, 10]
# Create subplots
fig, axes = plt.subplots(2, 1, figsize=(10, 12))
# Plot ln(x) and approx_ln(x) for different n values
for n in n_values:
approx_ln_values = [approx_ln(x, n) for x in x_values]
axes[0].plot(x_values, approx_ln_values, label=f'approx_ln, n={n}')
axes[0].plot(x_values, true_ln_values, label='ln(x)', color='black', linestyle='--')
axes[0].set_title('True ln(x) and Approximations')
axes[0].set_xlabel('x')
axes[0].set_ylabel('ln(x)')
axes[0].legend()
axes[0].grid(True)
# Plot the difference between ln(x) and approx_ln(x)
for n in n_values:
approx_ln_values = [approx_ln(x, n) for x in x_values]
difference = true_ln_values - approx_ln_values
axes[1].plot(x_values, difference, label=f'n={n}')
axes[1].set_title('Difference between ln(x) and approx_ln(x)')
axes[1].set_xlabel('x')
axes[1].set_ylabel('Difference')
axes[1].legend()
axes[1].grid(True)
# Show plots
plt.tight_layout()
plt.show()
Does the plot make sense? (Perhaps send what matplotlib shows)
Sure, one sec
This seems to suggest that the more iterations you've, the greater the chances are that the lines are coincidental (not sure if this is the right word).
Yeah well the error seems to decrease as you increase n
Yes, and as n = 10, the difference is 0 (?)
You could use integration to find a numerical value of the error
Almost 0, can't possibly be 0 due to floating point precision
Is this a reasonable procedure for this:
- Define a function to compute the absolute error between the true ln (x) and the approximate ln (x).
- Use numerical integration to compute the integral of the absolute error over a specified range of x values.
- Plot the integrated error as a function of n.
For the first step, I was thinking
def error_function(x, n):
true_value = np.log(x)
approx_value = approx_ln(x, n)
return abs(true_value - approx_value)
which has a straightforward purpose, right.
But then we want to numerically integrate this absolute error, abs(true_value - approx_value), so we would need to invoke the error_function in some kind of loop?
Neew to Python syntax, I've mostly used java before 😄
Yeah that's a good procedure
there is no reason to introduce a_next and g_next and why create the variable ln_approx
no reason to introduce a_next and g_next
why create the variable ln_approx
Why not?
they serve no purpose
Did you read the homework I had?
Consider $\frac{x-1}{a_i}$ as an approximation to $\ln (x)$.
ln_approx = (x - 1) / a_i
return (x - 1) / a_i
As for g_next and a_next, I suppose I could just write
for _ in range(n):
a_i = (a_i + g_i) / 2
g_i = np.sqrt(a_i * g_i)
Instead of
for _ in range(n):
a_next = (a_i + g_i) / 2
g_next = np.sqrt(a_next * g_i)
a_i, g_i = a_next, g_next
yes
Yes this is true, thanks. I won't need ln_approx later, I think 🤔
We'll see down the line, there are some tasks left.
also I would rename error_function to error_estimate or something like that since "error function" is something else https://en.wikipedia.org/wiki/Error_function
Trying to use the multi-parameter (3 parameter) Jacobi smoother on the 2D Convection Diffusion Equation with a O(h^2) discretization.
For the smoothing factor to be minimized, I have to minimize { maximum over the high frequency region [ | (1 - w3 * dinv * z )( (1 - w2* dinv * z ) (1 - w1 * dinv * z ) | ] }, where z varies over the high frequency region. I can give the expression for that as well. Factoring out w3w2w1 and simplifying, we have:
min { max over the high frequency region [ | (t - t1)(t - t2)(t - t3) | ] }. I need to find the optimal parameters t_i = 1/w_i ( they need to be real ) so that min-max is obtained. t = dinv * z ranges over some complex domain, which is shown in the picture. If t was real, then it would've been simpler.
So my question is, how do I find the optimal parameters t_i so that min-max is obtained?
If t was real, say a <= t <= b, we use x_i = [cos(2 * i - 1)pi/6] ( zeros of the Chebyshev polynomials in [-1,1] ) and then get w_i = 2/[ (b - a) * x_i + ( b + a)]. Tried doing a similar thing with my complex t, took the absolute minimum and maximum to get a,b, calculated x_i and took these x_i from [-1,1] to [a,b] but I am not getting the correct results.
Sorry for the long story, I thought it would be better if I explained what I'm doing. In short, just need to find ( real ) optimal parameters x1,x2,x3 so that min-max of |(z-x1)(z-x2)(z-x3)| is obtained, where z is complex.
in numerical analysis offical term, this is "Absolute Error". So name function absolute_error
I found this at https://math.stackexchange.com/questions/236364/colleague-matrix and was wondering, how to prove that. i guess one has to show, that the characteristic polynomial and p(x) are the same? the polynomial is in chebyshev basis.
Induction probably
There is a similar matrix called a companion matrix and the equivalent proof for a companion matrices eigenvalues being the roots of the polynomial is doable using induction
(this was my proof, although probably done before, from my dissertation)
ahh thank you, yeah induction makes sense
hi guys! I have a test on friday and this is eating me up. I have this differential equation to laplace transform: dy²/dt² + 3dy/dt + 2y(t) = 5u(t). My teacher solved it and got (-s²-s+5)/s³+3s²+2s however, I can't get around how on earth he got that numerator. I did get the same denominator though
the numerator that I can only come up with is 5+2s
this is not the channel. please go to #odes-and-pdes
Does the concept of upper/lower hemicontinuity ever get used in the numerical analysis literature?
The application is to decide whether a low forward/backward error algorithm is in principle possible for a problem.
Even if it is used, would it be considered obscure?
How do you rigorously show that $\sum_{n=1}^{\infty}((1+\frac{1}{n})^{n} -e)$ diverges?
CJ_:)
Better to take this to #real-complex-analysis
But I imagine this is pretty hard
Where did u see the problem?
Hey can someone help me implement this (task 4)?
Do I need this for Task 4?
a_0 = (1 + x) / 2
g_0 = np.sqrt(x)
This is what I've currently:
# Task 4: Write fast_approx_ln(x, n) as a faster converging function for approximating a natural log.
def fast_approx_ln(x, n):
if n < 1 or x < 0:
raise ValueError("n must be greater than 1, x must be greater than 0")
a_0 = (1 + x) / 2
g_0 = np.sqrt(x)
a = [a_0]
g = [g_0]
for i in n:
d_i = a_i
Where I think I will need to have initialized a_0, g_0 before? Then, somewhere, store elements of a_i in an array.
$(1 + 1/n)^{n} - e = \Theta(1/n)$ doesnt it
L
$n \log(1 + 1/n) = n(0 + 1/n + c/n^2 + O(n^{-3})) = 1 + c/n + O(n^{-2})$
L
You want to repeat what you did before and find $a_i$ and $g_i$, where we only find $g_i$ for the purpose of finding later $a_i$'s.
Then once we have $a_i$ just use the recurrence relation for $d_{k,i}$ to find $d_{i,i}$
Max
Is anyone here familiar with semi-discretization in the context of delay differential equations
Can you help me understand this formula
(For $i=0, \ldots, n:)
[d_{0, i}=a_i
d_{k, i}=\frac{d_{k-1, i}-4^{-k} d_{k-1, i-1}}{1-4^{-k}}, \quad k=1, \ldots, i for i>0]
And what it means by “accelerating the convergence”?
dghf
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
Replace $k=1,2,3,4$ ext into the formula (write it out completely)
Max
Isn't it derived by canceling some terms in the asymptotic expansion of the error
does anyone know how difficult a grad course in numerical methods for pdes covering these topics would be for an undergrad?
Accuracy, stability, and convergence of finite difference discretizations of partial differential equations, numerical dispersion and dissipation, Fourier and Chebyshev spectral methods, boundary conditions, coordinate mapping, collocation methods, fast multipole methods, applications
i have a background in numerical analysis/numerical linear algebra, unconstrained optimization, and numerical methods for odes (mostly RK and multistep methods)
do you have background in multivariable calculus or PDEs
multivariable yeah and i haven’t taken a pdes course yet but i’ve done stuff with fourier analysis and am currently working on some stuff with the helmholtz equation
by multivariable calc do you mean like calc 3/vector calc or more like calculus of variations and multivariable analysis?
both I guess. I would think you need some theoretical PDE background to handle a numerical PDE course. It seems that you have some at least.
well, you definitely need vector calc for the Gauss-Green formulas
yeah ideally i’d want to take pdes beforehand but the numerical pde course is rarely offered and analytic pdes isn’t offered until next spring
i understand the basic notion of like a sobolev space though
anything from calculus of variations since that’s probably my weakest area
that can be adressed ad hoc I think
Sobolev spaces and Fourier analysis are more fundamental
also do you think any of this would show up?
Introduction to techniques used in solving problems arising in a variety of physical settings. Sturm-Liouville problems and Green’s functions. Methods of solution for the wave, heat and Laplace equations. Variational principles.
primarily green’s functions since idk much about that
tho the num methods for pdes course only requires a numerical lin alg/analysis/optimization course and a course on quadrature/RK/finite difference/method of lienes
Sorry for late response. I just thought of it while thinking about e and infinite series, especially things that go to 0 like 1/n
Issue is that we don’t have a base d value that we can find later d values for. Or is there something I’m missing?
Hello people, I'd like to know if it is better to apply a numerical method to the PDE on u, or to apply it to the analytical answer of the equivalent PDE on w. Any help would be appreciated.
I've already emailed the author of the lecture notes, where I found theses equations, but still no answer.
not sure if this is the right place to ask
but suppose I want to approximate e^x
and for some reason, I want such an approximation to have the weierstrass function property of being cont. everywhere but differential nowhere
how would I accomplish this?
The Weierstrass function f is bounded, so just do e^x + εf
are there any explicit solving algorithms that are more suited for stiff problems or is it all qualitatively similar and implicit methods are necessary
here you have some discussion https://docs.sciml.ai/DiffEqDocs/stable/solvers/ode_solve/
Documentation for DifferentialEquations.jl.
many thanks
would someone be able to look at my code for solving the Helmholtz equation in 2D? Im trying to compare the Born approximation to the series solution for a penetrable circle centered at x0 and there's some mismatch between the two
Could someone help me get started with the first task please? 🙂
This is how far I've gotten 🤦:
class Interval:
def __init__(…)
I'm pretty rusty with classes but I'm guessing the idea is:
Create a way of storing intervals (list?)
Implement a function for each of the operations they detailed for instance for add:
def sum(i1,i2):
# Copy one of the intervals, only for the structure
i3= [0,0]
i3[0]=i1[0]+i2[0]
i3[1]=i1[1]+i2[1]
return i3
Also when you create an interval you need to initialise a and b (the endpoints)
@heady mesa I don't know what the theory is even covering, like what am I looking at, it's just a bunch of jargon.
And I am referring to the math related stuff, what in it do I need to understand to tackle the first task, it's just a bunch of information, would help if it was broken down.
Have you ever done questions like:
what does that say? is this related to physics?
do i need to understand physics for this problem?
Nope this is from GCSE maths exam in the UK
Essentially with different schemes in numerical analysis you will be able to quantify the "accuracy" of what you found using an interval.
For instance the root of $x^2-2=0$ is in the interval $[1,1.5]$.
In the example I sent, as we round the data, or potentially our measuring instruments only have that accuracy then we can create intervals for each measurement:
[g\in [12.65,12.75)]
[h\in[9.2935,9.2945)]
You can then use these facts to create a bound for f. This is useful because often when you take measurements or do calculations you have limited accuracy and you'd like to know how much accuracy you have in the thing you care about (in this case f)
Max
@heady mesa What's our f here? And how do you even get the range to begin with ([12.65, 12.75), [9.2935, 9.2945)), what's the usefulness in having this range when we have mathematical tools that already can give us a precise value for something?
@heady mesa So I basically just read everything over and I am still having difficulty understanding what they want us to do in the first task. What relation does the theory have to what they want us to implement?
Basically they want you to implement the functions they spoke about
How does
two real numbers representing the left and right endpoints respectively
figure in here from the theory?
As part of the class
The two real numbers describe the interval
ie. Interval(a,b)=[a,b]
So you could store the interval as a,b
So… they want a catch-all function inside of the class Interval that will cover all of these?
I tried asking chatgpt and it came up with an interpretation which leads me to believe this is the implementation they're looking for? From the text:
For even $n>0$ we have to distinguish between three cases:
$$
[a, b]^n=\left{\begin{array}{l}
{\left[a^n, b^n\right] \quad a \geq 0,} \
{\left[b^n, a^n\right] \quad b<0,} \
{\left[0, \max \left(a^n, b^n\right)\right], \quad \text { otherwise. }}
\end{array}\right.
$$
Be aware that powers of intervals are not repeated products, i.e. $[a, b]^2 \neq[a, b] *[a, b]$.
dghf
Do you know what a python class is?
Yes
I am not saying the LaTeX above is the python implementation if you think so
is the implementation as easy as just
class Interval:
def __init__(self, leftEndpoint, rightEndpoint):
self.leftEndpoint = leftEndpoint
self.rightEndpoint = rightEndpoint
def __repr__(self):
return f"[{self.leftEndpoint, self.rightEndpoint}]"
Yeah, make sure left is less than right
now that i come to think of how a interval works, [a, b] implies that a is smaller than b, so i would need a condition to check if a is smaller than b
The first one is just creating the structure of an interval in python
Great minds think alike
let me change the variable name to sth shorter, and you can give me feedback based on that
class Interval:
def __init__(self, left, right):
if left >= right:
self.left = right
self.right = left
else:
self.left = left
self.right = right
def __repr__(self):
return f"[{self.left, self.right}]"
here i was thinking that if [left, right] has left as greater than right, then it means leftshould be at the right. otherwise, if left is not greater, i keep the order as is.
thoughts?
Yep nice
As for this task, do they want me to provide the methods inside of the class Interval?
I'm a bit rusty of classes but yes (if you can define an operation between 2 objects of the same class)
This is not python OOP channel tho.
But for your question. you can implement operator overloading methods such as add, mul. sub, eq etc.
For addition, it's union two intervals. You can implement method add(self, another) and operator overloading add for the same effects
For multiple, it's cartesian product,
For substract, im not sure either, probably not a good one.
For equal, check whether two intervals are equal.
I believe this is about operator overloading. So best bet for you is to check list of overloading methods and select four that makes sense
Tbf it's clearly for a numerical analysis module
Most of numerical analysis is actually programming, hence the study of floating point numbers being so central
I understand classes arent really maths tho 😂 and tbh id just use a list lmao
anyone know of any good free (or cheap) online courses/references for using hpc/parallelized systems for solving pdes
i have access to a cluster but no one in my institution really does that kind of work with pdes
This is my implementation, is it correct?
def __add__(self, other):
return Interval(self.left + other.left, self.right + other.right)
def __sub__(self, other):
return Interval(self.left - other.right, self.right - other.left)
def __mul__(self, other):
products = [self.left * other.left, self.left * other.right, self.right * other.left, self.right * other.right]
return Interval(min(products), max(products))
def __truediv__(self, other):
if other.left <= 0 <= other.right:
raise ValueError("Division by an interval containing zero is undefined.")
quotients = [self.left / other.left, self.left / other.right, self.right / other.left, self.right / other.right]
return Interval(min(quotients), max(quotients))
(…] you can define an operation between 2 objects of the same class
Is that what I did withselfandotherabove?
Test it on some examples, we aren't here to check your homework :/
What's wrong with getting help with HW? I would ask for help on the python server if they had a channel for numerical analysis, but alas.
Test it on some examples
I did and it gave me expected results, but there could usually be a way to optimize a code, or there is something I am missing.
this is not numerical analysis. this is questions regarding syntax of a programming language
no this is really not about numerical analysis.
It's python specifically OOP, operator overloading.
Notice the word "module" meaning this is from a module on numerical analysis, I didn't say it was numerical analysis
in practice, how many iterations of the QR algorithm should you do for the eignevector deocomposition? Or said otherwise, how do you know you have converged?
Lots but depends on the type of matrix
For instance on Hessenberg matrices this algorithm converges really quickly
if you do qr on an upper hessenberg matrix H, a simple way to determine convergence is you can just check H_{n, n-1} and see if it is below a specified tolerance, then deflate and repeat
Sup?
Sorry for this picture, latex wasn't working properly. Anyways, my question is below.
I am using a 7-point difference operator (actually a 5-point difference operator with b, f = 0) to discretize the 2D model anisotropic problem.
L_h in stencil notation is given.
Where a, b, c, d, q, f, g are constants, Nx = Ny and I = Nx - 1.
The grid is (0,1)^2 $ and the ordering of the grid points is in the North-East direction. That is, column-wise bottom to top, left to right. The matrix formed by L_h is A. Now, I am trying the 7-point North-East ILU decomposition for A.
A = LD^{-1}U - R, where L, D, U all have the same diagonal and R is the error matrix.
Now LD^{-1}U comes out to be weird, at least for me.
Keeping in mind that we only need to solve for alpha, beta, gamma, delta, mu, zeta, eta, we equate some of the (non-constant) D_i to the corresponding elements of the matrix A, we get the following:
For k outside the range [1, I^2], elements are to be treated as zeros.
The main question I have is, how do we solve this recursion? The recursion for delta is the most confusing as we need the next delta to solve for the previous one and so on.
Sorry if the question is long, I've tried to explain what I'm doing and how I'm getting everything. Thank you.
(Note: I've tried the East-North ILU decomposition the same way and the recursion is solvable)
I'm currently researching adaptive stepsize methods, and I wanted to ask the following:
Many methods seem to use approximately this formula to calculate/recalculate the step-length
Which makes plenty of sense to me
My question is whether there's an agreed-upon standard for what exactly the local error tolerance T is set to
Currently I'm just using $\dfrac{\text{global error tolerance} \times h_n}{\text{length of integration inveral}}$
Aepeiron
Which, according to some heuristic arguments of mine, should make it quite likely that the global error is (approximately) within the global error tolerance
However, I've also seen people use the DE's current state to calculate T
And I'm not sure of what the advantages of the one or the other would be and which one is generally better
i did some research in this area for linear multistep methods
we used Milne’s estimate of the local truncation error and then a step size update based on that
Does anyone know a root finder for systems of nonlinear equations that has guaranteed (or very likely) convergence?
NR doesnt converge for one of my unknowns, and a generalized bisection method is unreasonably complicated to implement
Doesn't converge or don't have a close enough initial start point?
Perhaps perturbing the Newton Raphson method, I've seen it before where we people convert the NR algorithm to:
[
x_{n+1}=x_n-\alpha \frac{f(x_n)}{f'(x_n)},
]
then using a value of $\alpha$ that is close but not equal to 1, but I've never actually tried this myself.
Alternatively secant method?
Max
Specifically, I am writing an implicit solver for a 6 dimensional stiff problem, all of the variables converge very quickly except one which explodes within one step. Usually, the best initial guess for the next state of the solution is your current position but apparently not here, I could of course play with it to get a better guess like you suggested, but its not very feasible to do it at every step until it works considering there's going to be billions of them, each one requiring multiple NR iterations for the solution, and then a good "guessing" algorithm on top of that to make it work at all (if it even ends up working). It's becoming way too inefficient computationally.
I was looking into the generalized secant method and might implement it, I'm just not sure if its going to converge more consistently and I'd like to avoid making unnecessary detours.
Non linear equations in many dimensions are so annoying, NR is of course the king but as far as I'm aware there is no good way of picking a start point other than looking at physical observations/ guessing.
There are definitely methods but if you can i would look at alternatives. I did a project last year on a differential equation solver and the first method was converting a non linear ODE into a set of non linear equations, because solving non linear equations was so bad I completely scrapped that part of my project and focused on avoiding them.
truly wish i could do that
Im dealing with an state dependent delay differential equation which is highly nonlinear and even the delay is defined only implicitly so no readily availible solvers can handle it and I have to cook something up myself
Other alternatives include a stabilized runge kutta method which sounds like an even bigger nightmare to implement 😐
making a good root finder is probably the most feasible solution right now
this is known as damped newton
can't you pretend your problem is 1D, wrt the bad variable?
and in each step fix it and solve for the rest
then for the 1D problem you can use whatever robust thing you wish
e.g. you could even use bisection if you know a point where this is positive and a point where it is negative
does anyone have any suggestions on an interesting personal project related to parallel programming/high performance computing for pdes that i could try to implement
doesn't have to be original or research related, just something that i could work on in my free time
or more generally something mathematical that involves parallel/hpc
but how am going to know that the rest of the variables was solved correctly
in the most trivial example lets say i have to equations involving, the first one being x+y = 0 while the second one is something very ugly for y that i can't handle, if I just fix the initial guess for say y = 10e6 then i can obviously solve for x but it does not mean its correct
or am i misunderstanding
you mentioned that the other converge just fine
e.g. assume you need to find the roots of f(x,y1,...,yn) = 0
if you fix x = xi then g_{xi}(y1,...,yn) = f(xi, y1,...,yn)
so you can apply e.g. a few steps of Newton on y1,...yn for a given xi
with that procedure the y1,...,yn would supposedly converge to something
so you can now ignore those
and optimize w.r.t. xi
the idea is to treat your problem as 1D problem in the nasty variable
e.g. use bisection on it
which is guaranteed to converge provided your function is continuous and you start with a bracket where the function is positive and where the function is negative
so apply Newton only partially
I understand what you're trying to say but I am not sure it would give to correct solution
becaues the nasty variable is involved in almost all of the equations
I could fix it and solve the roots of the equations
but once i start using the bisection to find the nasty variable, it wont be fixed anymore
the previous equations that I found the roots of will be modified and the roots won't be correct most likely
best case I can imagine is having to alternate between fixing the variable and solving the roots, then using bisection for the last variable, and then going back to correcting the roots
I'll be meeting a professor who specializes in differential equations today so i'll ask him for opinions on the issue
that's the point yes
if you know what the nasty variable is, you fix it, you use newton for the rest, then once those become stationary you have a function value f(xi, y_1^*, ..., y_n^*)
you may have to modify this so it you minimize |f(xi, y_1, ..., y_n)| instead, as otherwise there is no guarantee that for the fixed xi there is a root, albeit maybe Netwon would still converge in the other variables
are you solving some elliptic non-linear PDE?
if yes, then maybe you can frame it as a prabolic one
Guys, how to increase the speed of this code using numba
%%time
@njit
def find_roots(coefficients):
result = np.roots(coefficients)
return result
@njit
def func(num_polynomials,degree):
result = np.zeros((num_polynomials,degree),dtype=np.complex128)
for i in range(num_polynomials):
coefficients = np.random.rand(degree+1) + 1j*np.random.rand(degree+1)
result[i] = find_roots(coefficients)
return result
num_polynomials = 1000000
degree = 10
result = func(num_polynomials,degree)
print(result)
I am not sure where to ask this
and treat the nasty variable with a specific integration technique
If this isn't the correct server. Plese let me know where I should ask this
alright now i understand what you were saying
i actually might try to implement that
state dependent delay differential equations
heavily nonlinear of course
no available solvers can handle an implicit delay, the problem can become stiff for explicit solvers, and NR didnt converge for an implicit solver
so this might very well be my next step
cuz the next alternative is a corrected runge kutta and i am really not thrilled by that idea
well, bisection should always converge provided your function is continuous and you have a bracket with a positive and a negative endpoint, the question is really whether Newton would converge for the rest of the variables once you fix the nasty one
also, what the other user mentioned - you can use damped Newton
I will try to implement your tips, many thanks for the help !
the problem with damped Newton is that the step would affect all variables, but maybe that is necessary
finally, if you have some parameter that you know makes the problem stiff
you can phase it in slowlyu
e.g. assume I have a parameter alpha, and my problem has alpha = 1 but is stiff, and for alpha=0 I have a non-stiff problem, but it is not the one I wish to solve
then you can solve several problems starting at alpha = 0 and progressing to alpha=1
and use the solution of the previous problem as initialization for the next one
Hi, is it true that in the worst case, the conjugate gradient algorithm needs 20 or 21 iterations to converge for the minimization of a quadratic equation of dimension 20? Let's suppose it won't find the minimum before 20 or 21 for this quadratic equation.
it actually is parameter dependent
but the issue is i am interested in the stability properties with respect to that parameter
ive already obtained a stability chart (using semidiscretization on an associated linearized system), but i need to validate it with a brute force time domain simulation
im gonna have to vary another parameter in the process as well and perform a lot of simulations for the different parameter sets, i wont really have the computational budget to vary the parameters so slowly and reference previous solutions as well on top of that
actually the stiffness depends on the combination of both of these parameters
yeah, im in an unfavourable position with respect to available techniques
woulda called quits by now if i didnt need it for a conference paper
then you can approach the stiff problem along a curve that goes from non-stiff to stiff, I would suppose that you can get a good initialization from a problem with parameters that are close by
mathematically yes, in the presence of floating point error - not necessarily
but a 20x20 matrix seems small enough
so unless it has a terrible condition number I would expect CG to converge in 20 (21) or fewer iterations
depends on how clustered the eigenvalues are
the way to think of CG is as solving the following minimization problem at step k:
$\min_{x\in \mathcal{K}_k(A,r_0)}\frac{1}{2}x^TAx - x^Tb + c, \quad \mathcal{K}_k(A,r_0) = {r_0, Ar_0,\ldots, A^{k-1}r_0}$
criver
where r0 = b- A x0, where x0 is the initial guess (if it is zero, then r0 = b)
So the Krylov subspace in which you minimize the problem grows at each iteration
and eventually it gives you the whole span of A
initially the CG algorithm was devised as a direct method actually
I didn't mention it yet, but the stiffness is hard to quantify, some solutions look reasonable (the physical interpretation is very dubious but it doesn't diverge) but with sufficient magnification it is obviously nonsense, and it might not start up until a certain point
and even still, I don't think my technical skills adequate for your plan lol
if you can compute solutions for the cases where your parameters do not break the problem, then you can take these solutions as initializations for slightly offset parameters
I have done this for nonlinear PDEs
where e.g. I know that some parameter makes my problem troublesome
then I solve the problem for some nice parameter, then use that as initialization for when I slightly tweak the parameter and so on
in a parabolic setting this can also be achieved by phasing in the nasty parameter slowly
e.g. at each step it can be tweaked a bit
I am really not sure how I could implement this for an SD-DDE
because slight modifications could cause drastic qualitative differences
not might, will
does the solution continuously depend on the parameters?
then it's still fine, it may just be that the tweak needs to be too small to be useful in practice though
like close by parameters should produce close by solutions under some notion of closeness
if not then it isn't exactly a well-posed problem
still though, what information from a given parameter set would be useful for the next one, that is the point that I can't wrap my head around right now
what information from the solution*
I mean, say I have the nonlinear PDE A(theta)(u) = 0, where theta are some parameters
I am not really interested in exact numbers even, it might as well be 1 significant digit of accuracy, the main concern is the stability
pick a theta0 that is nice enough, i.e. you can use NR to find a solution u0
now tweak theta0 a little bit to theta1
use u0 as an initialisation of NR
and solve A(theta1)(u) = 0
and so on
you can create a path between theta0 and your target theta*
as long as the solutions are continuous in the parameters
then this should work
yeah ok i see what you are saying
the only issue with practicality may be that the useful step is too small between theta_i and theta_{i+1}
yeah, based on the physical interpretation i can already tell it wont be practical sadly
also you should not expect that all steps are equal, since the parameter space does not have to be homogeneous
I can imagine the step size would have to be multiple orders of magnitude smaller than the resolution im interested in
this has worked well for me, but I was solving an entirely different problem, so it could be that in your case it is not practical
other options are as mentioned - damped newton, applying bisection to the troublesome coordinate
yeah I will try to implement the bisection tomorrow
what's your solution like? a number? a 1d grid? a 2d grid? a mesh?
3 separate variable graphed against time
heres a concrete example
this is the derivative of one of the variables
looks acceptable
significantly magnified, still passable
zoomed into one region, absolute nonesense
this behaviour does not occur for different parameter sets
I was thinking of nonlinear multigrid for initialization, but if it's 3 variables that's inapplicable
its 3 of interest, 6 that describe the system
one should also note that at some point you may just run out of precision, I have had such graphs where I basically hit the limit of floating point precision
i.e. you should differentiate floating point error from error due to the method
thats actually a very fair point
the thing with the parameter phasing in that I mentioned: it worked great, until I ran into something like your graph, and it wasn't the method that was to blame, I was just running into the limits of double precision
but I can't tell just from that
it's just something to be aware of
yeah that makes a lot of sense
the main reason i know its a stiffness iseeu
is that once i further varry the parameters, it diverges very quickly into oblivion, much faster than what the physical interpretation would allow
I do not think a 10% parameter change would make a displacement go from 5mm to 10^56 meters in 5 seconds
regardless, this is important to note
I found this, idk whether it's relevant to you: https://arxiv.org/abs/2401.11247
I haven't dealt with DDEs specifically though
I will look into it, many thanks
does anyone have a good def on lipshits coontinous?
what is the name of the course you re taking so we know how best to answer what you want exactly to know about the lipschitz continuity?
^
Lots of different definitions in different spaces
I'm guessing it's a basics numerical analysis course and they are studying fixed point iteration methods?
In which case:
Interesting, so mathematically The kth iteration of conjugate gradient minimizes some residual over a k dimensional subspace, so if the system is n dimensional then the maximum number of iterations is n since it is minimizing the residual, and hence solving the system, over the entire space?
and n + 1 would be if there is a floating point error?
@scarlet spruce not necessarily 21 you mean?
This was the definition used in my course for numeric methods
Actually, I haven't seen any fundamentally different definitions for it
Yeah generally the only difference is that instead of a modulus you use a norm
Then it can be applied to general spaces
no the n+1 is just depending on how you count the iterations I guess, technically it should be n, but since you mentioned 21 I supposed you're counting the setup twice or something
floating point error can make those iterations many more I guess
or maybe even non-convergent if the condition number is too large
you can use a metric too
but for small matrices getting a condition number that's very large is not that likely
the steps could be as few as 1 though, if you have a matrix that has only one eigenvalue
expand |Ax-b|^2 in terms of dot products, then compute the Hessian
f is alpha-convex when H(f) -alpha I is positive semi-definite
where H(f) is the Hessian
sorry, im taking real analysis
I dont think that s accurate, that s the definition for strictly convex one, for a-strongly convex (a-convex), we need eigenvalues > 0 and f'' exists
What can you say about the eigenvalues of the Hessian matrix (H(f) or f'') if H(f) - alpha I is positive semi-definite?
This is interesting, I don't know
@grave spoke I also have a confusion, is it true that if the determinant of the Hessian matrix = 0, then f is convex, if the determinant is > 0, then it is strictly convex. If f'' exists, and all the eigenvalues are > 0, then we say that f is a-strongly convex?
Sometimes I read that for a convex function, we need the determinant of the Hessian matrix to be non-negative, but that s not accurate, right?
No, zero determinant for the Hessian matrix is not sufficient for convexity of the function. Positive determinant also does not imply strict convexity.
Can you elaborate more please
You should rather check whether the Hessian is positive (semi-)definite, which can be done by investigating the eigenvalues or all the principal minors of the matrix.
Ok, but then, how would I solve this problem?
What problem?
to determine alpha we need not to look at eigenvalues, because we can't determine them?
Well, compute the Hessian of the function in terms of a
I did
the result is
-4a 1
1 -4
the determinant is
16a -1
now if f is convex the determinant should be = 0, thus 16a - 1 = 0 => a = 1/16
but the first principal minor also should be greater than or equal to 0
from there I got my info (thanks a lot for the explanation!)
so then the answer should a >= 1/16
but what makes a strictly convex f?
which is not true, thus the function is not convex? in our class nothing was mentioned about the first principal minor
we were told that f is convex if it is monotically inceasing and f'' is non negative for all x
it turns out that you can't make them both non-negative for any choice of a
that's not true, f(x) = x^2 is convex but not monotonically increasing
maybe you mean f'
a >= 1/16 is expected to be correct
if f'' is non-negative, then f' is monotonically increasing
yes
but that is false, you can try to plot the function for the choice a = 1, which is greater than 1/16, but the graph won't look convex
yes
but with that incorrect criterion, it is expected to true
even though it s indeed false
the criterion is that all eigenvalues should be non-negative
or that all the principal minors (not only the determinant) are non-negative
so you should solve for a in eigenvalues?
just compute the eigenvalues as a function of a
what's stopping you
just replace inequality with strict inequality then
I use positive for strictly positive (> 0) and non-negative for greater than or equal to 0
Ok then
what did you mean by replacing inequality with strict one for strictly positive?
I see you not mentioning the rule of the determinant in this, why is that?
So:
convex: all eigenvalues >= 0
strictly convex: all eigenvalues > 0
a-strongly convex: f'' exists and all eigenvalues > 0
?
for the characterization, you need that f'' exist in all cases (otherwise how would you even get the eigenvalues)
That s correct indeed
a-strongly convex is if all eigenvalues are positive
the largest value possible for a would then be the smallest eigenvalue of the Hessian matrix
strictly convex does not require that all eigenvalues are positive
but if all eigenvalues are positive (strongly convex), then the function is guaranteed to be strictly convex, so it's a sufficient condition
Ok, but I am confused, what makes f strictly convex and f convex, if not all eigenvalues are positive as you stated above?
f(x) = x^4 is strictly convex (and thus convex) but not strongly convex (the eigenvalue of the Hessian matrix at the critical point is zero)
Yes
So:
convex: some eigenvalues >= 0
strictly convex: some eigenvalues > 0
a-strongly convex: all eigenvalues > 0
?
Like this
Ok
in the multivariate case, >= 0 means positive semi-definite, and > 0 means positive definite
which can be characterized by the eigenvalues or principal minors of the Hessian matrix
in particular, f if strongly convex if f''(x) - m is positive semi-definite for all x and some m > 0
this is what criver said in the beginning
I see now
?
This one has eigenvalues of λ_1 = -sqrt(4 a^2 - 8 a + 5) - 2 a - 2 and λ_2 = sqrt(4 a^2 - 8 a + 5) - 2 a - 2
λ_1 has no solution in R, and λ_12 has a solution for λ_1 >= 0 (because we want to see if it s convex)
that's convex but you can't make a statement about whether it's strictly convex or not just using the information of the eigenvalues of the Hessian matrix
Now this is interesting,
you need information about further derivatives if you want to show that it is indeed also strictly convex
f''?
f'''
(because it is a quadratic polynomial, it should be strictly convex if it's convex btw)
.
Now back to this
.
This one has eigenvalues of λ_1 = -sqrt(4 a^2 - 8 a + 5) - 2 a - 2 and λ_2 = sqrt(4 a^2 - 8 a + 5) - 2 a - 2
only λ_2 = sqrt(4 a^2 - 8 a + 5) - 2 a - 2 >= 0 has a solution in R
,w eigenvalues of [[-4a, 1], [1, -4]]
ok, just checking
Yes correct
λ_1 has no solution
λ_2 has, thus to check if f is convex we check λ_1 >= 0
which yields a <= 1/16
for f to be convex
we need to check first when 4 a^2 - 8 a + 5 is positive or negative
,w plot 4 x^2 - 8 x + 5
aah
ok, it's always positive (no further restrictions then)
what do you mean by having no solutions?
They exist
and it turns out that one of them is always negative
Yes
Ok
what do we need to check for f''' to find if it's strictly convex?
The problem is that the teacher marked a >= 1/16 as correct!
so he is basically saying it's convex
back to this one
so I have to expand in what sense?
the first non-zero derivative has to be an even order derivative and positive
well, either he made a mistake or he meant that -f is convex (or f is concave), which is true
How is the Hessian matrix of f related to the one of -f?
What do you mean exactly?
matrix wise or interpretation?
hessian matirx of -f would be the one that satisfies the conacvity
if f is convex
If f'' is the Hessian matrix of f, then -f'' is the Hessian matrix of -f
If f'' is negative semi-definite, then -f'' is positive semi-definite
Thank you, that should be helpful
Back to this
@grave spoke so I should expand that one, and then check for the eigenvalues of what?
Try to determine the Hessian matrix first
btw, what would be the suitable form to expand it to?
Can you prompt ,w as you did before?
what prompt would you give it, it s very handy
\begin{align*}
f(x) &= \norm{A x - b}_2^2 \\
&= (A x - b)^{\tp} (A x - b) \\
&= x^{\tp} A^{\tp} A x - x^{\tp} A^{\tp} b - b^{\tp} A x + b^{\tp} b
\end{align*}
Zanarcane
Compute the gradient of this function first
If you're not used to the matrix-vector formulation, you can also do it componentwise using the following expression:
\[f(x) = \sum_{i = 1}^{n} \parens*{\sum_{j = 1}^{n} a_{ij} x_j - b_i}^2\]
Zanarcane
Thanks
@grave spoke I have a question about sets
let's have for example this one
If I want to determine if it is convex, how should I approach that?
I am also interested in determining if a set is closed or open
By reading up on the definition of a convex set first
Yes
f the Hessian of f is positive definite, then the set defined by the inequality f(x,y)≤c is convex for any constant c
But I have >=
Times both sides by -1
But how would you treated with that -1
?
The set is the equally defined by [ -x-y\le -2]
Max
