#numerical-analysis
1 messages · Page 9 of 1
The Hessian of the LHS is just the 0 matrix which is trivially positive semi definitive
so it's convex
Does it mean this set is convex and concave?
Yes
I meant the definition is says that the f should be positive definite for the set to be convex
so the set is not convex?
It's neither according to this definition
We have showed it's SEMI POSITIVE definite
The definition asks for it to be POSITIVE definite, which it isn't
is there another defnition
Idk and tbh this isn't numerical analysis lol
The definition is fine, you just need to understand the difference between POSITIVE DEFINITE and SEMI POSITIVE DEFINITE for this question
Ok
the answer to the question turned out to be: no
i implemented it in the end
worked well for some time
but in the end NR got stuck in an infinite loop after solving a few of them correctly
a tried implementing bisection for another variable to see what would happen
only makes convergence slower but doesnt work
Finally decided to tell my supervisor im stuck and we need a change of plans
did you try minimizing |f(x,y)| instead?
if this gets stuck in a loop it would mean that you also need a good guess for the other variables, so that wasn't the only problematic variable
not sure how i would go about doing that robustly
I have no clue what my initial guesses will be
I mean, it will be the solution of the previous problem but I don't know anything about that
you just apply NR as a minimization method to to |f(x)|^2 instead to as a root finder for f(x) = 0
the classical option is as mentioned damped newton
tried that as well, did not help
wdym by "I tried that"
what did you do for the step?
I damped the steps with differing damping ratios between 0 and 1, I can't really get any information from the system to make more informed guesses, it always got stuck in a loop, it just took more time
you're not supposed to take random damping
say you have a direction d (not necessarily from Newton)
though im considering this
then you try to find alpha^k such that |f(x^k + alpha^k d^k)| is minimized
you can do so with backtracking like search or whatever
the other option is to go to something more robust like trust region methods
minus or plus?
dumb question but ive always seen it with a minus
though i suppose it would depend on k whether it matters
depends whether you consider d^k to be the descent direction or whether you consider -d^k to be the descent direction
e.g. d^k = -\nabla f(x^k) for steepest descent as I wrote it
you can fall back on gradient descent if Newton doesn't work
e.g. say that the Hessian is not positive definite
that's if you do the optimization version
in either case, try gradient descent, try damped Newton, try optimization Newton, etc
there are many things that can be tried before one gives up
it's also an opportunity to learn some methods for nonlinear equations/nonlinear optimization
I suppose you are right
ive already learnt quite a few techniques
id love to take my time with it but i need it to validate results for a conference paper, thats the only reason im rushing it
many thanks for the advice by the way, it was very helpful
Hi guys I want to ask a geninue question on how to optimize mathematical software. I read highly optimized library like LAPACK, scipy, for example, to quote Heath:
"The performance of today's microprocessor-based computer systems often depends critically
on judicious exploitation of a memory hierarchy (registers, cache, RAM, disk, etc.)"
Does it means if I want to develop latest, fastest scientific library, I need to drill down to manipulate the memory level?
If might sounds stupid but I'm thinking to develp my own software to out compete existing ones
If you want to compete on performance with existing highly optimized libraries, then surely you need to do the same kind of performance optimizations they do.
(Or, alternatively, have a totally new optimization idea that the existing libraries don't even use).
Being aware of memory hierarchies so you can write code that are friendly to them is definitely essential for performance as soon as you're operating on any appreciable amount of data.
You might not need to write code that specifically depends on the exact layout of the cache hierarchy for one particular machine (though that is something people do from time to time, to squeeze the last drops of performance out of their particular hardware, at the cost of all the work potentially being lost when they upgrade). But you do need to have an awareness of which kinds of algorithmic patterns play well together with caches.
wow thanks a lot. This seems like a huge topic. Is there a book or manual somewhere i can learn this kind of optimization?
I'm not aware of any book that presents it in an organized way. (This may say more about my awareness than about what exists, though).
but software like LAPACK, eigen, and scipy for example is very generic. It was not developed for specific hardware architecture. So there is a lot more specialized kind?
for example these ultra performance critical places, like hedge funds, NASA, they do not use generic library right? they do their own thing
Code can still be either friendly to cache hierarchies or not. If your memory accesses walk all over the address space at random, the code will be markedly slow on any modern hardware, compared to an implementation that makes an effort to keeping its memory accesses more clustered in general -- even if the latter is not tailored to one specific cache hierarchy.
they just hire companies to develop stuff https://www.nasa.gov/post2/ https://techport.nasa.gov/view/93669
yeah that's what I want. build that company 😄 Gonna study on. Thank you.
Hello! Does anyone know how the Block Symmetric Gauss Seidel Preconditioner $B$ for a linear system $Au = f$ is defined?
Because I know how the linear iterator of Block SGS works, and I know that if you write it in the form
$u^{k+1} = u^k + B(f - Au^k) $
Then you can use B as a preconditioner for the Conjugate Gradient (if B is spd that is...)
The problem is that I cannot put the iteration in that form, and I can't find anywhere an explicit definition of the Block SGS preconditioner (but even the non-symmetric one to begin with...).
If anyone is familiar with these topics and would like to give me a reference to some article/book I'll be grateful, thanks 🙏🏻
Mòrag • [HEAVENLY SPEAR] •
(btw I'm working with A spd of course, since I want to use PCG)
do you know how Gauss-Seidel works on a 2x2 system?
if yes, then now replace the elements in the 2x2 system with matrices, and apply Gauss-Seidel with those
$\begin{bmatrix} a_{11} & a_{12} \ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \ b_2 \end{bmatrix}, \quad x_1^1 = (a_{11})^{-1}(b_1-a_{12}x_2^0), ,, x_2^1 = (a_{22})^{-1}(- a_{21}x_1^1 + b_2)$
criver
now just imagine a11, a22, a12, a21 are block matrices
where a11, a22 must be square and invertible for (a11)^{-1} and (a22)^{-1} to be defined
x_2^0 is the initial guess
it probably makes sense to first study block Jacobi
I know how Jacobi and block Jacobi work, and the block Jacobi preconditioner for a matrix A is simply the block diagonal matrix obtained by A, where we invert the blocks
Like that, where A_1, ... , A_N are square blocks on the diagonal of A
So my guess seeing this is that the preconditioner B for Gauss Seidel would be the inverse of a Block Triangular matrix...?
yes, but you will not implement it like that
you will implement it by forward substitution
$X_i^{k+1} = (A_{ii})^{-1}\left(-\sum_{j=1}^{i-1}A_{ij}X_j^{k+1} + B_i - \sum_{j=i+1}^n A_{ij}X^k_j\right)$
criver
This looks like you actually have to solve for X_j^{k+1} for 1<=j<=i-1, but you don't since you have these from the previous steps
so you compute X_1^{k+1} using B_2, X_3^k, ..., X_n^k, then X_2^{k+1} using X_1^{k+1}, B_2, X_3^k, .., X_n^k, etc
it's similar to block Jacobi in the sense that you need to invert the blocks A_ii
it's also obviously more sequential than block Jacobi
as you need X_{1}^{k+1}, ..., X_{i-1}^{k+1} to compute X_i^{k+1}
To make Gauss-Seidel symmetric I guess you would do an additional pass backwards
@balmy frigate what are you using this for?
if it's a PDE maybe multigrid is better
or domain decomposition methods
yeah i had exactly the same issue when i implement the Gauss Seidel. So there is matrix equation. But if you just use that straight up it will be extremely slow due to inverting the matrix. NOt to mention conditional number of matrix and stability problem
You want to just use regular "programmer" way. Iteration
Matrix way is mathematically elegent but extremely inefficient. I did that in my class, it's order of magnitude slower than others
yes, you are obviously not supposed to compute A_{ii}^{-1} explicitly unless it's super small or easy to invert and store
typically the A_{ii}^{-1}v part is evaluated by using a solver
I'm trying to use multigrid as a preconditioner, I've got two questions. What should be the preconditioning matrix and what is an efficient way to find the multigrid iteration matrix? Trying to put all the e_i's through one multigrid cycle for all i would take a long time. MATLAB btw.
for what problem?
if it's a linear problem just use multigrid directly
no point making it a preconditioner
are you doing geometric multigrid?
Rotated convection diffusion. Multigrid convergence is satisfactory, but I want to confirm some results.
Yes
so I guess an initial boundary value problem
for what solver do you want to make this a preconditioner?
GMRES(m)
ok, that's fine I guess, since your matrix should not be symmetric due to the convection term
you know how preconditioning works in GMRES already? Purely in matrix notation?
you substitute said matrix with the multigrid iteration
it's up to you how many cycles, pre smoothing, post smoothing iterations, and what smoother you use, as well as what restriction and extension operators you use
so just implement it as an operator that acts on a vector
Do I use the multigrid iteration matrix as the preconditioning matrix or do we modify it a bit?
you are not supposed to form the matrix itself
you just perform the iteration
e.g. you should write a function
Oh.
where you pass in the vector
all the details for multigrid
and then use that
forming a matrix is stupid
it's redundant and slow
it's like this in lectures for simplicity
but you implement Pv in a matrix-free fashion
e.g. simplest is just using a few iteration of Jacobi
as a smoother
damped Jacobi
the downsampling can be as simple as averaging 4 neighbouring nodes in 2D
I suggest writing a proper multigrid solver before trying to make it a preconditioner
I am a bit confused though, in Trottenberg's Multigrid book, it is said that the preconditioning matrix is C = (I - MG)*Ah^(-1).
by proper multigrid I mean you don't form a multigrid matrix
I have that.
Was just creating the matrix for preconditioning.
I can't tell anything without context on the notation
I is the identity matrix, MG is the multigrid iteration matrix.
Ah is our matrix that comes from discretizing the PDE.
And that C is inverted when preconditioning
Should I take some screenshots and send them here?
Of the GMRES algorithm and the remark about the preconditioning?
look, say you have two levels h and H, h being the fine, H being the coarse
and you have A_h u_h = b_h
Yes
you start with some guess u_h^0
run some smoother iterations on it on the fine level
you get u_h^{1/3}
now compute the residual r_h^0 = b_h - A_h u_h^{1/3}
on the coarser level, solve A_H e_H^0 = R_H^h r_h^0
then once you have e_H^0
add in the correction: u_h^{2/3} = u_h^{1/3} E_h^H e_H^0
now run some post-stmoothing iterations on u_h^{2/3} to get u_h^1
and you repeat this process
for multi level just nest this
Yes, I understand
don't form any matrices, inverses or whatever else
the only thing you may need to form is A_h and A_H as sparse matrices
I understand, and I have code for multigrid cycles, they are not using the multigrid iteration matrix
R_H and E_h can be implemented as is
I was forming the matrix for one cycle of a multigrid for preconditioning
don't form any matrices
sure, go ahead
I can perform multigrid cycles, that's all good. How do I precondition with multigrid? In GMRES with preconditioning, there is one step where we have v_(k + 1) = AM^(-1)v_(k), where M^(-1) or M is supposed to be the preconditioning matrix.
Now without forming my multigrid iteration matrix, how can I perform this preconditioning?
Some confusion also comes from Trottenberg's book, where M = (I - MG)*A^(-1), MG is the multigrid iteration matrix
I haven't looked up specifically the GMRES preconditioner but in CG it's just s = Mr, so what you are solving here is A_h s = r with multigrid, that's it
so the right-hand side becomes the current CG residual
the unknown is s, what you solve for
you can get an initial guess for s with a full mulltigrid cycle
remember that the ideal preconditioner is A^{-1}
well here you are approximately solving A s = r
with multigrid
so M ~ A^{-1}
in Saad's book you have left and right versions for preconditioning
idk which works best in your case
but the idea about the preconditioner should be more or less the same between solver, up to requiring modifications for solvers that require symmetry, and flexible preconditioning
but the point is that if you have your multigrid solver code you can directly use it
the optimal parameters may not be the same for preconditioned iterations however, be aware of that
and preconditioned solvers are not necessarily faster than applying multigrid directly
I see, thank you. What do you mean by optimal parameters?
Saad also has stuff on preconditioned GMRES in his book
well you have to choose a bunch of things in multigrid
such as smoother, number of iterations in pre and post smoothing
cycle type
number of cycles
restriction and extension operators
Oh okay
there's no "one multigrid"
those things depend on the problem, and in your case also depend on the solver since you're using it as a preconditioner
what shows up as a right-hand side would depend on the solver you use
different solvers result in different types of residuals
e.g. CG can introduce high frequencies in the latter, while Jacobi smooth things out, so depending on what solver you apply this preconditioner to, you may have to tweak parameters for optimality
so it's not so easy
Very informative, thank you
you should look at a book for different variants of preconditioned GMRES
as mentioned there are left, right, split, flexible, etc
I'm using this for Machine Learning. I have a huge SPD linear systsm and I wanted to solve it via Preconditioned Conjugate Gradient, so I looked up in the literature what preconditioners one can use, and I found out that a linear iterator works also as a preconditioner, and by linear iterator i mean a matrix B that defines the linear iteration
$x^{k+1} = x^k + B(b - Ax^k)$
So I need to understand how this B is made, to apply conjugate Gradient to
$BAx = Bb$
Of course, I actually only need the action of B on a vector to use PCG, for example if B is given by block Jacobi with, say, 10 blocks, at each step I have to solve 10 linear systems
Mòrag • [HEAVENLY SPEAR] •
The problem is that I don't want to actually use the linear iterative method itself, but rather PCG, and I need to understand how the iterator (the B) is built. Then I don't compute any inverse explicitly, if I have to apply an inverse matrix to a vector I just solve the corresponding linear system
For context, these are the two propositions I have in mind
2.2 tells you that a good linear iterator works also as a preconditioner
2.3 tells you that, vice versa, a good preconditioner also works well as a linear iterator
you don't build B
whenever you read s = M^{-1}r in the PCG code
you are actually solving A s = r with the iterative preconditioner
so for instance if the preconditioner was k steps of Jacobi
you would just compute k steps of Jacobi on A s = r
if the preconditioner was Gauss-Seidel, you would apply several sweeps on A s = r
for PCG you need symmetric Gauss-Seidel typically
so a forward sweep followed by a backward sweep
Yes, to preserve symmetry
where does the SPD system arise from?
I suppose A is sparse
Alright, it's starting to make sense...
The problem is that the PCG solver I use requires to know the action of B on a vector
And I'm kinda using it as a black box
It's not! Great plot twist I know, but I actually have a 10^5 x 10^5 dense spd matrix to deal with
Kernel Ridge Regression
Combined with some Nystrom Projections to reduce the dimension
Otherwise it would be like a matrix of order 10^8 or smth
And unfortunate Kernel matrices are dense
your kernels have infinite support?
use something like a fast multipole method
Yup, I use a gaussian kernel
(10^5)^2 dense sounds not very nice
It is not, indeed
the Gaussian kernel decreases quickly enough, unless you made it super flat
The problem is that I have to make it flat since my data lives in high dimension, and if I take the variance too small, since the data is very far away, basically my matrix becomes the identity due to floating point arithmetic
but should far away points really be strongly coupled?
In practice I have to take sigma = 10^3
how do you store this thing in memory even?
That's a good point, the problem is that my data lives in dimension 784, and the closest point are like at a distance of 10^something
I already tried with smaller variance
Usually you don't store it directly in fact
You access it via kernel evaluations
(btw I think my matrix is of order 10^4, because I can store it with my 16gb ram)
But the problem is I want to test what preconditioners really works well for accelerating convergence
you can then try running cholmod on it instead, since it's dense anyways
I'm bounded to use PCG unfortunately
Cholmod is some modified version of Cholesky?
yeah, look up suitesparse
I think that for your problem probably directly using multigrid with domain decomposition would be fairly efficient
multigrid to aggregate kernels together
domain decomposition as a smoother
I will, thank you
I'm not familiar with multigrid methods unfortunately
this aggregation in multigrid is essentially something like a fast multipole method also
you could try domain decomposition with PCG
but that would not clear up the large scale interactions between blocks
which is problematic
Unfortunately I haven't heard of this thing either, but thanks for mentioning it, I'll check it up
it's like block Jacobi/block Gauss-Seidel, but with overlapping blocks
You can do it with overlapping blocks?
if you introduce extension and restriction operators that make sense, then yes
Don't you get like... Conflicts or something if one equations belongs to more blocks?
the simplest thing you can do is of course to just use a diagonal preconditioner
no, the solutions do not overlap themselves, it's the data that you use for those that overlaps
have you tried a diagonal preconditioner
Oh, so you will like solve overlapping systems and then just add upp the solutions (?)
another option is incomplete cholesky
Yes, which is basically Jacobi
but honestly for a 10^4 dense system I would just directly use cholesky
And block Jacobi as well
so cholmod
Yeah, the problem is that one day I'll have access to more powerful computers and bigger data, so I really have to find something that potentially works up to 10^6
then fast multipole methods
or multigrid + domain decomposition
CG as a bottom level solver for example
the thing with CG is that it will quickly get rid of the high frequencies of the error
but then it would stall on lower frequencies
which is why multigrid comes in handy or fast multipole methods
the domain decomposition stuff is mainly useful for breaking up the problem into smaller pieces which you can fit in cache on the GPU and work in parallel
I will look up some of the things you mentioned because I'm not totally familiar with terms like "frequencies of the error"
In the meanwhile, I can just say: thank you!
if you decompose your vector in the DFT basis
then some coefficients correspond to frequencies higher than the medium, and others lower
stuff like Jacobi would kill off high frequencies quickly
also stuff like CG, even though it can also be a rougher
the lower frequencies can be dealt with quicker if you consider a lower resolution of your problem
e.g. through multipole expansion, geometric multigrid, or algebraic multigrid
Criver, in the GMRES step, we have v_(k+1) = A M^(-1) v_k. For x = M^(-1) v_k, I'm solving Ax = v_k through one cycle of a multigrid with a random guess.
Then, for x_m = x_0 + M^(-1) V_m y_m, I'm doing the same thing. For x = M^(-1) V_m y_m, I'm solving Ax = V_m y_m through one cycle of a multigrid with a random guess. But convergence isn't occuring, relative residual stays near 1.
how to prove that its a 2nd order convergence :
for this explicit runge kutta 2 method
then probably a random guess with 1 cycle is not enough
that or you have a bug
Are there any explicit RK5(4) methods that have 6 function evaluations and the same-first-as-last property, such that they only actually require 5 function evaluations per step?
Hello, ive done measure theory and functional analysis (using Axler) but have not taken an intro course in numerical methods. Is there a good textbook to start with for such a background?
i like trefethen's numerical linear algebra
for numerical methods for diff eq leveque's finite difference method book is good
also convex optimization by boyde for more optimization stuff
Heath is the most comperhensive for intro at your level. It's more general intro to all aspects of numerical methods as oppose to specialized topic liek trefethen and boyde
I’ve come across some books on “Numerical Functional Analysis” and “Theoretical Numerical Analysis” but was wondering if such a book covers standard topics in an intro numerical methods course, just at greater depth
my understanding is that numerical analysis is a very broad field, but a standard numerical methods intro class deals with pretty elementary problems that don't necessarily require functional analysis
so maybe you can jump directly to more advanced topics, like finite elements
I don't know you guys but there should be order on how you learn things right? You can't just jump into advanced topics without even seeing an overview of it. Jumping into finite elements without finite difference is like learning Hamiltonian formalism without knowing newtonian mechanics
Quite absurd to be honest
I don't think Heath's book is not easy even for advanced mathematician with no background in NM
You can try it on. Guaranteed mind-blowingly good
there is a reason it's a golden standard
I know it’s standard to learn the basics before abstraction, but I was unsure if there are more rigorous introductions to the topic rather than starting with the standard material.
I’m a masters student and I find that courses that lack rigor/proofs (like a lot of stats I’ve done at my school) to be mundane and boring.
I’m trying to propose an independent study which would cover the standard numerical methods introduction (because it’s a course I need to earn my degree) but also apply some of the analysis I’ve learned (which would be of interest to some of the other students who want to study more analysis but have taken all the courses we offer)
My school uses Sauer
I see. I looked at Sauer very briefly, it's less advanced than Heath. It explicitly stated for undergrad. While Heath is for advanced undergrad, or grad in physics and engineering. I think you should take a look at Heath first. There is you know...Anna's arch...
You will be pleased with Heath
What I also like about Heath is it's presentation. One of the best modern math typeset
i have a question which is quite opinionated but i cannot find much online so looking for people with real experience- i have to choose between either numerical analysis or advanced linear algebra for my mathematics with a data science emphasis degree. which one of these would be more applicable for real life?
Depends, I think a specialism in Numerical analysis is probably not as useful but if you have had no experience in NA then I'd take the NA. Most importantly though, figure out which one YOU like the most, ie. read some texts / watch some videos / study some proofs. As if you do something you enjoy you will become a lot better at it. Also most jobs won't ask for specific math modules.
Assming you want to get into DS/ML for work? Can you share the topics covered for both classes?
I’m leaning towards mostly actuarial work but also possibly DS. I figured the DS emphasis would be more useful vs taking complex, real analysis and abstract algebra for my career.
Yes I’ll find some topics tomorrow in the syllabuses
Unsure of the math actuaries do/need to know of... not really sure then
Coudl someone please explain to me what on earth the operation in line 3 does? that's the hodge star, I assume applied to a 0 form, but I don't know what the exponent notation actually means.
guys, is there an written algorithm to check whether matrix is upper trianglar or not?
like formal pseudo code algorithm. preferably from a book. not something you wrote
for (i=0; i < n; ++i)
for (j=0; j < i; ++j)
if (a[i,j] != 0)
return false;
return true;
O(n^2) is the best in this case?
probably yes unless you have more structure in your matrix known apriori
@glad mortar do u know if it's possible to solve that problem without expanding into partial derivative form, just superimposing the grids and multiplying
and i just figured it out myself , sorry for ping 💀
this probably has to deal with the notion of a "regular level set" in differential topology. Namely, if you have a constraint f(x, y) = c, you need the gradient of f to be nonzero at every point that satisfies that constraint
The constraint there can be viewed as an equality $g(x, y) = 0$ so $K = {(x, y) \in \mathbb{R}^2 : g(x, y) = 0}$. The constraint is regular if the Jacbobian matrix $Dg(x, y)$ is surjective (i.e. full rank) for each $x, y \in K$. In this case the set $K$ is a smooth surface.
L
Here $Dg(x, y) = \begin{pmatrix} 1 & 1 \ 2x & -2 \ \end{pmatrix}$
L
Invertible when $-2 -2x \neq 0$, i.e. $x \neq 1$.
L
Since x = 1 isn't in K, this constraint set is regular. Actually $K = {(-1, 1/2)}$.
L
Clear, thanks, I got now why it s not regular on K
would it be a good idea to write a paper on how ml can change numerical analysis
Of course, up and coming field and lots of stuff that hasn't been talked about enough
Can anybody explain this problem a bit?
what is Euler's inequality?
What are the best methods for solving a very large system of highly coupled nonlinear ordinary differential equations? Just normal (say, ~4th order) implicit runge-kutta?
Given that I want good accuracy, but don't need really high accuracy (like, I want on the order of 5 accurate decimals, I don't need 10 or 15 accurate decimals)
I’d assume it’s stiff because it’s very large and very coupled, although I have yet to verify that assumption numerically
Suppose I have R^n equipped with an arbitrary inner product <x,y>= x'Qx, where Q>0 is a positive-definite matrix. Let V1 be a r-dim subspace of R^n, and let V2 be the orthogonal complement of V1 with respect to the inner product. In my case, I have a basis for V1 and I want to construct a basis for V2. What would be the fastest way (in regards to time complexity) I could do this?
Anyone can help with this?
what is Euler's equality?
This is it
your answer is in the picture you just sent surely?
the implication is that the if $x^*$ is a minimum then the inner product inequality holds
Max
Yes, but my question is about equality, there we have inequality
as you can see there we have open set in the question, whereas in the notes I have closed, so I don't know whether there are other theorems that describe the question
Yeah
Hello all, I'm trying to implement householder QR decomposition effectively in python. language is not important but this is a prototype. I don't think i understand completely
'''
import math
import numpy as np
def householder(a):
m, n = a.shape
for k in range(min(n, m)): # loop over columns
# compute Householder vector for current col
alpha = -math.copysign(1, a[k, k]) * np.linalg.norm(a[k:, k], 2)
e = np.zeros(max(n, m))
e[k] = 1
v = np.hstack((np.zeros(k), a[k:, k])) - alpha * e
beta = np.dot(v, v)
if beta == 0: # skip current column if it's already zero
continue
for j in range(k, n): # apply transformation to remaining submatrix
gamma = np.dot(v, a[:, j])
a[:, j] -= (2 * gamma / beta) * v
return a
'''
This is my very vanilla implementation based on book algorithm. Not a effection one. In the book (Heath 3.5.1) It mentioned "the product Q of the Householder transformations need not be explicitly formed. In most software for
this problem, R is stored in the upper triangle of the array originally containing A, while the nonzero portions of the Householder vectors vk required for forming the individual Householder transformations are stored in the (now zero) lower triangular portion of A. " and
"If Q is needed explicitly for some other reason, however, then it can be computed by
multiplying each Householder transformation in sequence times a matrix that is
initially the identity matrix I, but this computation will require additional storage."
My questions:
-
But if Q is not formed, how can i get Q^T * b, the right hand side? Since I need right hand side in order to solve this linear square problem right? (See example 3.8). In order to compute Q, I need to compute every H_1...H_N? It explicitly stated H is not needed.
-
Also how to optimize storage according to book suggestion? basically "store v_k in stored in lower triangular portion of A". How to implement this in code? It only listed vanilla algo.
- I believe my implentation is subpar, for example I formed standard unit vector for each iteration. Which means extra 1*n vector storage. What's a better way to do this?
'''
np.set_printoptions(precision=4)
householder unit tests
A = np.array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.],
[-1., 1., 0.],
[-1., 0., 1.],
[0., -1., 1.]])
print(householder(A))
'''
this is a quick unit test. it will produce example 3.8 [R, O] matrix, but i still need Q^T * b right hand side. Don't know how to get it
Is this too big of a questio to ask here? I know it's a lot.
1: You can try applying the QR decomposition to a slightly different starting matrix.
To see whats going on, what you're really doing with householder is forming the matrix R = HnHn-1...H1A. = Q^T A. What you want is HnHn-1...H1b.
What happens if you start with a matrix that vertically concatenates A and b together?
2: Think about what the vectors v_k look like. The first vector is v_1 has length m, so it effectively has m-1 free parameters (can you work out what v_1 is if you only know what v_1[2:] is?). In addition, you now have m-1 zeros in A[2:,1], which is untampered by further householder iterations. Try storing v_1 there. Similar for each other iteration
3: You are forming e_k each iteration. But you can just modify v inplace by doing something like
v[k] -= alpha
This is great. I will take a closer look and change implementation later. But for 1. do I still need to compute H matrix each iteration in order to get right hand side? I can't think of a way to get both R and rhs without computing H
You should not need to compute H each time, think about what is happening to the columns of A after each iteration and try starting with an "augmented" matrix [A, b]
ok i augmented, the results are almost correct. But not quite.
you see. compare to 3.8, $H_3 H_2 H_1 b$ above 3 rows are the same. but below 3 rows it's [376, -1200, -3414, 5, 3, 1]
yehuihe
above is my result, but below is not.
I simply changed all processing matrix a to augmented
'''
def householder(a, b):
augmented_a = np.concatenate((a, b.T), axis=1)
m, n = augmented_a.shape
for k in range(min(n, m)): # loop over columns
# compute Householder vector for current col
alpha = -math.copysign(1, augmented_a[k, k]) * np.linalg.norm(augmented_a[k:, k], 2)
v = np.hstack((np.zeros(k), augmented_a[k:, k]))
v[k] -= alpha
beta = np.dot(v, v)
if beta == 0: # skip current column if it's already zero
continue
for j in range(k, n): # apply transformation to remaining submatrix
gamma = np.dot(v, augmented_a[:, j])
augmented_a[:, j] -= (2 * gamma / beta) * v
return augmented_a
"""
I've implemented a less advanced algorithm for QR decomp in numpy and got lots of floating point errors so I switched for the slower mpmath
'''
householder unit tests
A = np.array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.],
[-1., 1., 0.],
[-1., 0., 1.],
[0., -1., 1.]])
b = np.array([[1237., 1941., 2417., 711., 1177., 475.]])
print(householder(A, b))
so you're almost correct, but try writing mathematically what your algorithm has outputted
En i think @still forum is on the right track but I just can't conceptually understand it yet. I get that each step applying H to both A and b in augmented matrix will transform them both at the same time.
Also matrix storage for v_k storaged in A is hard for me to understand
So your R matrix is given by H3H2H1A
And after 3 iterations you should have
[H3H2H1 A , H3H2H1 b]
because the H_n vector only depends on the nth column
and the nth column is the same for both the original matrix A and the augmented matrix [A, b]
but what you appear to be doing is an additional 4th iteration, which is making [A, b] upper triangular
do the same code but do one less iteration and see what happens
ok. Im just wondering, if you do augmented matrix, that's another copy, right? meaning whole another storage, seems very suboptimal. that's ok, i want to get it right for now
I wouldn't be too concerned about allocating for making an augmented matrix, the bulk of your time will be occuring in doing the householder loop
like, householder is something like $O(n^3, mn^2)$
jamiecjx
Yeah I know. I meant storage
can someone explian the process for this to me
∇^6 and ∇ is this
apparently this is the procedure
but i cant figure what is exactly done here
uhh, i don't know the context but it looks like some 2-dimensional convolution
yes it is 2d convolution stuff,but i am getting different answer 
which one is the correct one though
thats what i am trying to figure out 
shouldnt i-1,j+3 be -2 via convolution
that part is confusing me
everything else kinda makes sense i think
same for i, j+3 , i think it should be 12
convolution is sum over m of f(m)*g(n-m)
so the kernel is actually "transposed" if you know what i mean
before you multiply elementwise
wait so i have to transpose the one molecule everytime ?
if you mean the ∇ as molecule then yeah, before you multiply the values you have to transpose it
hmm ok, they didnt tell us that at all 
or used the term convolution
i will try it for other nodes and see
that's for convolution, the one without transposition is called cross-correlation iirc
though the terms are often mixed up, in convolutional neural networks they actually use cross-correlation contrary to the name
does all this term come in signal processing , they only told us in terms of Finite differene scheme
but this idea seems to work very well
damn this idea really helped me thank u, i will look into this myself
yeah i think all of them are used in signal processing, convolution is closely related to fourier transform by the convolution theorem
it can be applied to some problems in e.g. probability theory too
like finding pdf of sum of two random variables with known pdfs
finite difference can also be expressed in terms of convolutions so i see where it comes from
ye the finite difference approach takes forever but this is really quick
Can someone give me a general definition of what a "Terminal Expansion" is?
in what context did you hear of it
It is for a numerical analysis class...
that much i deduced myself 😅
though i never heard of that term, what type of algorithm/method does it refer to
Hi guys, I've developed an algorithm to find a solution for a equation system that models a Chebyshev mecanism. I've used two different initial points and I'm having a bad time understanding why both solutions converges and why they're different and which gives the more accurate answer. Can someone help me pls?
how does your algorithm work
hi! i'm looking at discontinous galerkin methods and i'm a little confused about two definitions of stability that are being thrown around --
one is just that the time derivative of the L2 energy seminorm is nonpositive. this makes sense to me as being what "stable" means
the other seems to be that we can bound this energy seminorm with an exponential (Ce^{at}).
1 -- are these equivalent? i don't really understand why being bounded with an exponential means that it's "stable", won't the energy still tend to infinity (there is no mention of the constant a being necessarily nonpositive)? i can't seem to find anything about this
2 -- i'm a little confused by my book's definition of "well-posedness" too, which seems to be that we can bound the norm (not the square of the norm, like the stability definition) by an exponential. why does this make sense to be "well-posed"
for reference i'm reading out of hesthavan/warburton ch 4.2
thanks so much!
Is this a good place to ask about numerical methods
Just got confirmation that it was so I’ll begin: I’m part of my honors physics program at my university and part of the program is that we have to do “research,” in the area of honors classes that we take every semester. I’m currently taking Honors Calculus I and the only thing I could think to research was some sort of either entirely new or improvement upon a pre-existing numerical method pertaining to integration.
I wanted to ask you guys if this was something feasible. I’ll be quite honest I’m mostly ignorant when it comes to the area of numerical methods. And I’ve only used some basic ones when making physics simulations.
Something I’m currently looking at is optimizing an already known numerical method of integration to be either quicker or more accurate via programming. I’m just looking for some input on the idea from people who are more knowledgeable in the area than me, and any advice or suggestions would be very much appreciated, thank you in advance.
You might want to post exactly the definitions you are working with
Because people would likeky not buy a book to help.someone on discord :p
hi guys, im implementing qr_iteration with shifts for both rayleigh quotion and wilkinson. I'm not so sure about wilkinson implementation cause there is no testing example. Can you check whether I'm correct or not? Rayleigh is simple and correct
`def qr_iteration_with_shifts(a, num_iterations, shift='rayleigh', verbose=False):
n = a.shape[0]
for k in range(num_iterations):
sigma = 0
if shift == 'rayleigh': # Choose shift sigma
sigma = a[n-1, n-1]
elif shift == 'wilkinson':
delta = (a[n-2, n-2] - a[n-1, n-1]) / 2
sign = math.copysign(1, delta) if delta != 0 else 1
sigma = a[n-1, n-1] - sign * a[n-2, n-1]2 / (abs(delta) + math.sqrt(delta2 + a[n-2, n-1]**2))
# TODO: change to my qr decomposition
q, r = np.linalg.qr(a - sigma * np.eye(n)) # normalize
a = r @ q + sigma * np.eye(n) # generate next matrix
# # TODO: pretty format
if verbose:
print(k + 1, a, sigma)
return a
It is not exactly correct to say that "stable" means that the solution remains bounded as $t \to \infty$
レナ
A simple example is to look at the initial value problem
$$
\frac{du}{dt} = u, \quad u(0) = 1.
$$
One can easily see that the exact solution goes to $\infty$ as $t \to \infty$. In fact, if your numerical method yields a bounded numerical solution, that means your numerical method is incorrect.
レナ
So, the stability estimate
$$
E(t) \leq Ce^{at} E(0),
$$
where $E$ is an energy norm, takes into account problems that have natural exponential growth.
レナ
For your second question, "well-posedness" here means in the Hadamard sense. A problem is well-posed if
- A solution exists
- The solution is unique
- The solution depends continuously on the given data
One can easily shown that for linear PDEs, if the energy $E(t)$ is chosen appropriately, the stability estimate $E(t) \leq Ce^{at} E(0)$ gives 2 and 3.
レナ
To learn more about well-posed PDE theory that is relevant to numerical methods, I suggest the book by Gustafsson, Kriess, and Oliger. The title of the book is "Time-Dependent Problems and Difference Methods."
thank you so much for the answers! i'll try taking a look
Question is in Dutch, but I suppose it won't be a problem for my question (if you want I can translate things tho). But for the weak formulation for the Finite Elements Method (FEM), we multiply both sides of the DE with a certain test function $\phi(x)$ over the domain $\Omega = (0, L)$ and follow this up by a Partial Integration (PI), now when doing the partial integration we'd get:
$$\left[ \phi(x) E(x) A(x) \frac{dy}{dx} \right]^L_0 - \int_0^L ...$$ But for the evaluation in L I know that will simply become $\beta \phi(L)$, but why is the evalution in x = 0 zero? Or how what did they do with that?
cedric_
Because $\phi(0) = 0$ since $\phi \in V$.
レナ
Ahh, is that always so? (couldn't find it in my syllabus somewhere tho)
nvm found it
thanks!
Not always. In this problem you have a dirichlet boundary condition at $x = 0$. So it is natural to choose test functions such that they vanish at $x = 0$ since we are not "testing" the variation of the problem at $x = 0$.
レナ
is this the lax-richtmyer theorem? or is it something else
nope, that is the definition of well-posedness by Hadamard
Ah, it just depends on how you define your test function?
Yes.
awesome, thanks a lot
you dropped an `
def qr_iteration_with_shifts(a, num_iterations, shift='rayleigh', verbose=False): n = a.shape[0]
oh thank you. finally. But I got this already.
So the question is: for the given two step scheme (boxed in red), show that the scheme is unconditionally consisten with order of two in both space and time. Now my question is: what's the reason they chose the second order Taylor expansion for $\rho_{n-1,j}$, $\rho_{n,j+1}$ and $\rho_{n,j-1}$? Because in another very similar question they chose the first order Taylor expansion. But unless I did something wrong to get my answer both of these give a different answer.
cedric_
just a guess but usually in numerical methods (at least for like RK, multistep, etc) that i’ve seen they choose a taylor expansion that results in the most cancellation
since usually you’re computing the order of the residual (e.g. U_True-U_approx) so you wanna take as many “similar” terms in U_true that you can to cancel out with the ones in U_approx
not necessarily taylor, but yes
ok, i will keep that in mind, thanks!
In practice, which methods are usually used to solve the nonlinear equations at each step of an implicit method?
I want to prove the claim that the matrix 2-norm is bounded by the frobenius norm. Here is my proof attempt. do you think my proof steps are rigorous and correct
For solving non linear equations you can use for example the Newton method or the fixed point method.
In practice, newtons method
While you have unstable initial points, the speed of newton is very useful
Its quicker to try a bunch of initial conditions and use a conditionally stable method that converges very quickly, vs an unconditionally stable.method that takes a lot longer to converge
Damn
Wouldn’t have expected such a simple method to be used in practice
Is it also used for larger systems of equations?
Like, in practice
Because it seems like randomly guessing a point from which you converge would be a lot more difficult there
it often just converges and there's no problem
I'm not sure if there's anything else to try for large nonlinear system (besides gradient descent, which isn't guaranteed to converge to a point which satisfies all equations anyway), most of the other methods I've heard of have pretty poor convergence too
Oh, ok
a) you would often have some idea of a proper initial guess. For example lets say you are solving a PDE numerically (i.e a system of odes solves over the grid at each time instance). You can use the previous state as your initial guess, as you are assuming that the system is smooth.
b) you can exploit the system structure in certain cases. For example, lets say you have a function f(x) =Ax-b, where x is huge, BUT A is tridiagonal. You can use that information to find a solution much much quicker than plain newton iterations.
b) is a bit of a stupid example, but there are ideas
Ok, that’s interesting, thanks
Any suggestions for resources that would describe this kind of stuff, i.e. how it’s actually done in practice when performance is necessary?
Uuuh nkt really its very application dependent
No it's not newtons method. It's hybird methods.
Appearently, a combination of safe Bisection and fast converging inverse quadratic interpolation. They are avoiding newton's method
I might be wrong, as I often does. But at least reference here seems pointing at a hybirding.
it only works for one dimensional problem though
the question concerns large systems
for one dimensional problem there's of course brent's method and ridder's method which are both safe and fast converging https://en.wikipedia.org/wiki/Ridders'_method
ah right i forgot in system of nonlinear has not safeguarded solution. Nice
So you’d also say newtons is used for larger systems?
Btw, which book is this?
there is a newton method for more dimension aswell right? or i'm i trippin?
yes
like x^(k+1) -> x^(k) + d with J(acobian) * d = - F?
it's the same formula x_{k+1} = x_k - F'(x_k)^{-1} * F(x_k)
Scientific Computing by Heath chapter 5. Also talked about systems
according to the book, systems are a lot more complicated and "no simple analogue of bisection in
one dimension that can provide a fail-safe hybrid method"
it discussed 4 methods : fixed-point iterations, newton's method. secant updating methods, Robust newton-like Methods
But seems no general consensus on which is industrial standard.
guess it's an open question
Isn’t newtons method always just linearly approximating both sides and setting them equal, then solving the linear system of equations?
yes
Ok, ty
Ok 👍
i have a question about stability of finite difference schemes: consider a linear pde of the form
$$Au=f \text{ in } \Omega$$
$$au=g \text{ on } \partial\Omega$$
for some linear operators $A,a$. Let $A^h$ represent the discrete form of the operator. I’ve seen stability for a fdm scheme to be defined as $$|(A^h)^{-1}|\leq C$$ for all h as h tends to 0 but I’ve also seen it defined as
$$|u|\leq C_1|f|+C_2|g|$$
ansh
are these two concepts effectively the same and the constants in the second form just depend on the inverse of the linear operators, i.e., the first definition of stability implies the second
or are they completely different things
Without boundary closures (or some treatment of boundary conditions) $A^h$ might not even be invertible.
レナ
But the numerical solution (if it exists) might still be stable in the sense of
$$
| u_h | \leq C_1 |f_h | + C_2 |g_h|.
$$
レナ
Question, does anyone here know how to handle repeated eigenvalues when doing arnoldi iterations.
can somebody help me with this von Neumann analysis question? help-forumvon Neumann stability analysis
Two possibilities. Look up deflation, and also block Arnoldi.
Thank you for the pointers, I will take a look
Have you tried deflation?
Well, I tried implementing algorithm 2 here:
https://slepc.upv.es/documentation/reports/str4.pdf
But I don;t understand how this is supposed to help. As an example if I use the identity matrix as the input, the krylov sequence is just v repeating.
Thus the span is a single vector. THus I don't see how this algorithm would generate new directions to produce an orthogonal basis for the eigenvector space.
I have an exam next tuesday on numerical analysis. I have to know how to do von Neumann stability analysis but we never did it in class and my professor is not answering my email. Am i doing this correctly? (ping me in response)
@magic cobalt I haven't done von-Neumann stability analysis myself but have looked at some books which go through it. Read those and maybe it'll help.
Numerical solution of differential equations - Zhilin Li,
Finite difference schemes and partial differential equations - John Strikwerda,
Computational partial differential equations using MATLAB - Jichun Li.
I done von Neumann, basically fourier expansion, easiler than spatial radius. This looks correct
alright thanks. Why is it so hard to find help for von Naumann analysis?😂 I ask people with a doctorate and they have never done it
I don't know. Maybe it's because of history. analysis spectral radius is standard stability analysis. Maybe von Naumann is newer. But interestingly both my numerical PDE and finance class teacher use von Naumaan instead of spectral radius. Surely easier
nah not newer, it was developed and coauthored by John von Naumaan in 1947. Idk why.
i'll take a look in the books. Thx!
bcs von Neumann analysis is too specific
i mean im a physics undergrad student
you need to know the specific form of your discrete operators and it only applies to linear and periodic problems
im suprised even applied math majors dont know what it is
oooh i didnt know that
well applied maths is more than just doing numerical PDEs
yeah but i mean thats a big part of it
i guess its something physicists use more
not really, i would say its a small part
i looked at the applied math degree at my uni and half the courses are about pde's
yeah i guess they dont have ot be numerical
when we make the ansatz $$ e^{ikn} $$ we are assuming that the problem is periodic alr
レナ
yeah thats true
then it only applies for linear problems
with constant coefficients, i think
the energy method is more mathematically solid
i guess thats why i learn it as a theoretical physics student. We assume periodic solutions 99% of the time😂
yeah
but ure dealing with nonlinear problems right?
in theoretical physics
depends
whenever we make an ansatz its always f(x)=Ae^(ikx) or something
schrodinger equation is an example
Then i suppose ure dealing with linearized problems
Assuming enough smoothness and regularity, we can study nonlinear problems by linearizing them
thats pretty much what we do yeah
but pendulums become non lineair at some point
if you dont lineairize it
but we alwats assume small angles and such
so its easier
can you do von neumann?
i have another question
i have been doing a few von neumann problems and they all went great but i couldnt do this one. i dont understand they go from $g_{1,2}=-\beta \pm \sqrt{\beta^2 + 1}$ to $|g_{2}(k)|>1$
Westvleteren XII
i just dont see it
$g_2 = -\beta - \sqrt{\beta^2 + 1}$.
レナ
ah yeah and $\beta$ cant be zero
Westvleteren XII
By elementary algebra it should be easy to show that
$$ \left| -\beta - \sqrt{\beta^2 + 1} \right| > 1 $$
for all $\beta > 0$.
レナ
To ask for mathematics help on this server, please open your own help channel or help thread. See #❓how-to-get-help for instructions.
Hi I need to numerically solve the eigenvalue problem to an integral equation
$\int K(x,y)\varphi(y)=\lambda \varphi(x)$
does anyone have a resource that explain the topic?
hrbibbi
is the central difference in time and space method for heat equation unstable? I can't seem to find many resources online on it and it doesn't work very well for me, though analytically it is consistent with order $\mathcal{O}(\tau^2+h^2)$
ansh
numerically it doesn't seem to work unless i have a very large mesh size in space and small sizing in time and even then, the heat flow is increasing rather than stabilizing to an equillbrium
ansh
where v(x) is compactly supported in [0,1]
central differences in both time and space for the heat equation is unstable for all values of $h$ and $\tau$
jamiecjx
perturb the solution by a fourier mode
and then solve for the perturbation to see if it grows or not
you should find that for all values of the critical parameter $r = \frac{\tau}{h^2}$ that the fourier perturbation grows
jamiecjx
yep, assume periodic BCs and do von neumann analysis. its unstable
just follow @still forum suggestion. try the ansatz
$$
u^n_{k \pm 1} = \hat{u}^n e^{ik(x \pm h)}.
$$
レナ
solving the recurrence relation for $\hat{u}^n$ will show that it is unstable.
レナ
I think the time stepping scheme needs to be changed. Since if we just discretize in space using central differences, the semi-discrete numerical approximation is provably stable using the discrete energy method (summation-by-parts).
yeah, from what I remember, first order in time is fine
explicit should be stable for $r=\frac{\tau}{h^2} < \frac 12$
jamiecjx
iirc we can use the same CFL condition and then use heun's scheme in time so we get an explicit second order both in time and space
I mean this seems like a pretty tricky task to do numerically
@still forum @sinful idol in von neumann analysis you end up having something like
$E_m(t+\tau)=E_m(t-\tau)+\lambda E_m(t)[sin^2(blabla)]$
where that first term is usually $E_m(t)$ so you can safely divide through and look at the error ratio
ansh
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
but in this case you end up having ratios between two steps so how do you bound that? (or fail to bound it to show instability of central difference in time)
This form implies that we have a recurrence relation in the form of
$$
\hat{u}^{n+1} = \hat{u}^{n-1} + \alpha \hat{u}^n,
$$
where $\alpha$ is independent of $n$. Try solving the recurrence relation.
レナ
i see that makes sense
thank you
oh it looks almost like fibonacci with some weight to it
Hi, someone can help with this numerical analysis problem? Especially part b)
what real analysis theorems do you know relating to derivatives of functions?
The way part b) is written is suggesting the use of a particular theorem
bigger hint spoilered ||mean value theorem||
i have a question, is there any integral that cannot be calculated numerically but can be calculated by laplace transforms?
i want to see if laplace transforms having any benefit for computational maths
I mean all integrals can be approximated if they converge, I'm guessing you mean can't be approximated to arbitrary accuracy? But then trapezium rule can do that but it's a bit expensive.
So I suppose you would be looking to numerically solve definite integrals? But these can be solved by rewriting the integral as a solution to an ODE.
[\int f(x)dx = I \Leftrightarrow \frac{dI}{dx}=f(x),]
To my knowledge I don't think there are any first order linear ODEs that cannot be solved very well?
Max
For Laplace transforms and Fourier transforms they are mostly used for PDEs... and hence double/triple integrals, I've never done the theory behind this but I'm sure someone else could point you in the right direction
Hello, I'm following the course 'CFD with high performance programming in python' and on the step 13 (cavity flow with Navier-Stokes), the pressure partial derivative $\frac{\partial p}{\partial x}$ is discretized with centered finite difference $\frac{\partial p}{\partial x} \approx \frac{p^n{i+1,j} - p^n{i-1,j}}{2 \Delta x}$. I tried replacing it with forward or backward differences and it doesn't work, but I don't really understand why we have to take the central finite difference. I guess that the central finite difference is more 'stable', but in that case, why don't we take it for the velocity derivative also ? If you can explain or have some ressources explaining finite differences intuitively and where/why I should use different formulations, I'd really appreciate it.
weareinthematrix
i feel like there should be non solveable integrals by numeric methods
probably not in the first order
I have the following system and I’m asked to do forward and backward substitution (in that order) but that doesn’t make sense to me. Wouldn’t you wanna do the reverse?\
$$\begin{bmatrix}
1&0&0\\frac{1}{2}&1&0\2&-3&1
\end{bmatrix}
\begin{bmatrix}
2&4&2\0&3&1\0&0&8
\end{bmatrix}
\begin{bmatrix}
x_1\x_2\x_3
\end{bmatrix}=
\begin{bmatrix}
-3\1\4
\end{bmatrix}$$
KySquared
You can write this matrix equation as [A(Bx)=c,]
If we let $y=Bx$, then you first want to solve:
[Ay=c,]
Then you can solve
[
Bx=y,
]
But to do that you do forward substitution in the first equation and backwards substitution in the second
Max
Uh I mean there are ofc hard to solve integrals but numerically.... unless there is a point that converges but is on the edge of not converging I guess we might encounter problems but nothing that can't be solved numerically tbh
I'm open to suggestions but there are lots of tricks for numerical integration
Hey I need help finding an appropriate stability condition to keep my 2nd order pde stable during numerical analysis can someone guide me towards ressources on that?
specifically for the damped wave equation
Anyone know why Gauss Legendre Quadrature for even functions on intervals of the form [-a,a] have a much larger error when compared to 2 times that on the interval [0,a]?
If you integrate on [0,a], and then double it to get the integral on [-a,a], that's effectively like using a quadrature with twice the order
The quadrature rule doesn't "know" that you are integrating an even function so it doesn't know that half of [-a,a] is [0,a]
Ah I see. Thanks.
i asked the following in a help channel, but it was recommended that i post this in the advanced pde channel (ive decided to repost it here):
hi, im a little new to writing pde solvers, and i was just wondering if it's common to have loads of oscillation when solving the advection equation using crank-nicolson + gauss-seidel. i've read that cn should be unconditionally stable for advection, but for large enough time step or fine enough special discretization my simulation blows up. are there better schemes for advection, or is a smaller time step/ more diffusion the best solutions?
i came up with a solution im happy with
for du/dt = -u div(u), im using cn and pretending the oscillations dont exist
and for everything else im using upwind
Is it valid to take logarithm/any other function of big O
Ex you have f(x) = C + O(x) can you then proceed to write log(f(x)) = log(C+O(h))
Is it possible to use Lagrangian/Newtonian Interpolation in a 3D space? Or what other methods are there I can use to find a differentiable, multivariable polynomial (or whatever other form) such that a series of (x,y,z) coordinates lie on that graph?
Newton's method for solving an equation f(x) = 0 generalizes to the case where f is a multivariable function R^n -> R^n, at least when the Jacobian of f is sufficiently far from being singular near the solution. Is there any similar generalization of the secant method for multivariable functions?
Generalized Secant Method for Simultaneous Nonlinear Systems originally credited to Wolfe and Bittner. Lesson shows how to solve nonlinear systems without the Jacobian, nor the need to approximate it, in a straightforward and visual manner. Example code on GitHub https://www.github.com/osveliz/numerical-veliz
Chapters
0:00 Intro
0:15 Prerequisi...
it should be pretty bad given that the order of convergence depends on the dimension
much like in gradient-free optimization algorithms
Ah, too bad, then.
And thanks, by the way!
I guess I should review the proof that the secant method in 1D converges with order phi.
Although it stands to reason that, if x_{n+2} is a symmetric function of x_n and x_{n+1}, and this function is moreover a linear extrapolation, then the order of convergence should have something to do with the positive solution of x^2 = x + 1.
(Fuzzy intuitive reasoning on my part. I'm not good with real analysis.)
The proof that I've seen that Newton's method in 1D converges quadratically depends on f(x) being of class C^2, with f'(x) not vanishing near the root. For the secant method, I guess it suffices to require f(x) to be of class C^1, with f'(x) Lipschitz and not vanishing near the root, right? Or do we still need f(x) to be of class C^2?
IIRC the secant method also requires $C^2$
レナ
Intuitively, one can see that the secant method approximates the derivative evaluation in the Newton's method using the forward finite difference
Would anyone know how to answer this? I put a b and e for my original answer but got it wrong
I was thinking that a could also be right
You can view quasi-Newton methods as a generalization of the secant method to the multi-variable case
Well, actually it is not always that Newton's method is faster than bisection method because of multiplicity of roots.
Hey, im playing around with scipy's 'solve_ivp' function so I don't have to write my own integrator with each project and before I move onto bigger things i'd ideally like to understand how it works / how to format the function signature?
$$\frac{d\theta}{dt}=\omega~ ; \quad \frac{d\omega}{dt}=-\frac{g}{l}\sin{\theta}.$$
$$\omega_{i+1}=\omega_i+-\frac{\Delta t \cdot g}{l}\sin\theta_i$$
$$\theta_{i+1}=\theta_i+\omega_{i+1} \cdot \Delta t$$
Clarkie
Here we first step to a new angular velocity before updating the theta, which is a physical approach, if we use an angular velocity at the moment i to compute the theta roc , the solution diverges from the 'analytical' one and goes coo coo,
def step_w(OMEGA_i , THETA_i , TIMESTEP , G = 9.81 , STRING_LENGTH = 10) -> float :
dw = -G/STRING_LENGTH * np.sin(THETA_i)
return OMEGA_i + dw * TIMESTEP
def step_theta(OMEGA_i , THETA_i , TIMESTEP) -> float :
dtheta = OMEGA_i # dtheta/dt = omega by definition.
return THETA_i + dtheta * TIMESTEP
MAX_TIME = 10
TIMESTEP = 0.05
timepoint = np.arange(0,MAX_TIME,TIMESTEP)
w = np.full_like(timepoint,10000)
theta = np.full_like(timepoint,10000)
w[0] = 0
theta[0]= 22.5 * np.pi/180
for i in range(1,len(timepoint)):
w[i] = step_w(w[i-1],theta[i-1],TIMESTEP)
theta[i] = step_theta(w[i],theta[i-1],TIMESTEP) # feed the new angular velocity into the step_theta function.
is all im doing for a manual implementation, and towards the end we feed the stepped angular velocity to compute the step in theta.
l = 10
g = 9.81
def pendulum_step(t,vals) :
W , T = vals
dWdt = -g/l * np.sin(T)
dTdt = W
return [dWdt,dTdt]
TIME_RANGE = [0,10]
THETA_START = 22.5 * np.pi/180
OMEGA_START = 0
solution = solve_ivp(pendulum_step, TIME_RANGE, [THETA_START, OMEGA_START], max_step = 0.05)
# f(t,y) = y^(t,y) , [0,max_time] , initial values , step.
# array will be have values TIME_RANGE / max_step
times = solution.t
thetas , omegas = solution.y
plt.plot(times,thetas)
plt.plot(timepoint,theta)
with the function 'pendulum_step', other than missmatching the tuple unpacking with the return vector, the solution using solve_ivp matches my manual implementation? I hope that illustrates what im not really understanding. :(
if not ill just read the solve_ivp src but that would be rather time consuming i'd imagine and it would be appreciated if someone has already understood it 👍
What you are doing is updating $\omega$ using forward euler and then $\theta$ using backward euler. This is not the default solver that solve_ivp uses
レナ
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
IIRC the default solver is RK45
I am also not sure what you mean by "physical approach"
If one instead does $$\theta_{i+1}= \theta_i + \omega_i \Delta t, $$ then this is still a consistent discretization of the problem.
レナ
Since this is a completely explicit scheme, one has to analyze the stability region to properly choose $\Delta t$.
レナ
Thank you for responding :DD, yes solve ivp does use rk45 by default, I was trying to setup a simple example just to try to understand it's behaviour, if I used the (i-1)th angular frequency when computing a step in theta I was getting a non sensical result , that's all
Have you tried doing linear stability analysis? Probably it is unstable for any $\Delta t > 0$.
レナ
I have not, I have my lab in an hour so I'll ask around
seems that as long as your system is coupled it doesnt matter what variable you step through first to get the subsequent of the other. So i'm guessing scipy just moves in the direction of which you provide the variables in the first place
like you were saying earlier pretty much :)
that solves my doubts so thanks a bunch :DD
is it better to compute A^(-1)b=x or to use gaussian elimination to solve Ax=b directly?
the latter
why?
computing A^(-1) efficiently is the same as using gaussian elimination to solve Ax=e_i for each basis vector e_i
and after that you still have to perform the matrix multiplication A^(-1)b
but asymptotically it takes the same amount of time?
asymptotically they are O(n^3), yeah
what about trying to solve Ax=b_i for each i < m
is there to speed it
up with gaussian elimination
asymptotically?
If you want to solve multiple linear equation systems with different right-hand sides but the same system matrix, then you would usually first compute a so-called LU decomposition of A using Gaussian elimination in O(n^3) and then perform forward and back substitution to solve Ax=b_i for each i in O(n^2)
in terms of running time wouldn't it be easier to just use gaussian elimination on the matrix [A|b_1 .... b_m] and then do basck substitution?
since gaussian elimination gives you the LU decomposition anyways
oh wait i think this method is faster in terms of numbers of systems
You'd have to perform more operations here, yeah
when you're saying forward and back substitution method is it something liek Ax=LUx=b
and then you solve the system
Ux=y
Ly=b
Yeah
Will try my question in this channel, but well, looking for any resource whether it be a book, a video, etc. related to Numerical Methods with this specific kind of format/solving in terms of factorial polynomials, Forward, Central, and Backward difference, that contains numerous examples with solutions. For resources online appear to be in differing formats, and most I've seen do not appear to use the same methodology to arrive at the "final" answer (marked in those that are boxed)
Um... is $x_n=(1/2)^n$ actually right?
I get $x_n=(x_{n-1})^2$ which gives
\begin{align*}
x_1&=(1/2)^2 \
x_2&=(1/4)^2=(1/2)^4 \
x_3&=(1/16)^2=(1/2)^8 \
&\dots
\end{align*}
How does the secant method rely on (sufficient) smoothness?
The formula in Newton's method has a derivative, so obviously we need our function to be differentiable at least once, but with the secant method no assumption about differentiability is made
i think your result is right, and it converges quadratically
secant method relies on fixed-point iteration (newton method too), and fixed-point iteration is guaranteed to converge to a unique solution only if it is (at least on some interval we are working on) a contraction
see banach's fixed-point theorem
and it being a contraction already means it's lipschitz continuous
Yes but that doesn’t mean it’s differentiable
Or more to the point, it doesn’t mean it relies on differentiability to work
yup i meant more that if it's differentiable and the derivative of the iterating function is < 1 then it should work
not the other way
wdym
what i'm saying is that (smooth and with small enough derivative => converges (maybe slowly)) but not the other way around
newton converges for almost every point for x^2 - 2 and its derivative is not bounded by 1 so the second part is not necessary
for the smoothness i don't have a counterexample tho off the top of my head
if you're asking about if the implication is true then yeah
for what...
is this a necessary condition as well if we change consistent matrix norm to the 2-induced matrix norm?
I'm not sure what you've written is an answer to my question or an answer to a subtley different question.
My question, rephrased, is
Is it correct to say that there are problems for which the secant method works but the function in question is not differentiable? (I.e. differentiability is not required to apply the secant method)
probably depends how much non-differentiability you are willing to assume and what is your end goal (e.g. for what values do you want it to converge)
Idk what you mean by "how much"?
Either a function is differentiable or it isn't
Or do you mean like, does differentiability fail at a single point and otherwise it is differentiable?
I'm not too sure if that's what you mean, because that wasn't discussed in the context of secant method
yup that
in root-finding methods it's usually assumed you start in some small neighbourhood of the root, so if it's just a few points in which the function is not differentiable then just choose a neighbourhood that doesn't contain any of these points and you good
otherwise we either have non-differentiability in every neighbourhood of the root but not root itself - and these functions are probably quite pathological, so I wouldn't count on secant method to work for them
or we have non-differentiability at the root, and f(x) = |x| is an example that it can work if the two starting points have the same sign
are any of you familiar with heun method for ODE's?
otherwise we either have non-differentiability in every neighbourhood of the root but not root itself - and these functions are probably quite pathological, so I wouldn't count on secant method to work for them
This is sort of what I was getting at. So if you had a function that was non-differentiable (or maybe just very difficult to differentiate) but it was continuous, then it sounds like secant method doesn't work so well?
Off the top of my head I don't have examples of functions like this (Weierstrass function is continuous everywhere but differentiable nowhere, but I don't know if that has applications to numerical methods)
then idrk any example that would be meaningful
also remember that functions like these are usually very pathological numerically, and by that i mean that if we can't say anything about derivative (and therefore we cannot bound it) then very small error in representation of a number can lead to method behaving very unpredictably
We usually don't want to work with functions like that when using numerical approximations so we care about sufficiently smooth functions
what about it
Ill send more details tmrw but i was wondering how to plot said method on a 2D graph when the output of the method is a vector (at least in my case)
Part c
just plot the positions as 4 curves? 2 for euler and 2 for heuns?
position for m1 is one curve and m2 another curve
and you solve it with 2 different methods, so 4 curves
Ill try it tomorrow
what langugae do you use at ntnu for courses like this?
English
programming langugae 🙂
ok 🙂
i think i figured it out, i think i have to take a riemann sum of the two last elements of the heun y_n vector, R(y_n[2]) and R(y_n[3]) wil approximate numerical values for u and v respetively and plot those values
no now i am uncertain
this is the final code
though i am uncertain if i am plotting u and v or u' and v'
There Are some errors that happen when i copy-paste from spyder to discord btw
Use ``` to start and end codes
print(‘hello world’)
Is 3.7 basically saying "we want to choose the various x_i in such a way that minimises the maximum of that product"?
The notation is a little unfamiliar
yes
or you could say you want to find input parameters x_0, ..., x_N such that the optimal value of a subprogram is minimal
where the subprogram seeks the maximum of a single variable function
Okie that makes sense. Another thing... what is the norm_infinity thing in 3.8?
I assume that is a particular kind of metric but idk what kind it is
usually $||x||_\infty = \max_i |x_i|$
Transparent Elemental
this is a norm on R^n, but since P_N is isomorphic to it they should be the same thing
though here they seem to be using the norm for functions, $||f||\infty = \max{x \in [a,b]} |f(x)|$
Transparent Elemental
Hello all, I'm tasked with writing a report on numerical analysis. I've only had a single lecture about the subject so far where we covered Lagrange interpolation and numerical integration
Any recommendations for light topics that fit this caliber? I looked up Pi approximation algorithms and FFT but most got over my head
maybe finite difference methods for heat equation?
there’s dozens of youtube videos doing code walkthroughs and it’s fairly easy to understand if you know a bit of linear algebra and the definition of derivative
as well as some ode/pde
hi all
anyone familar with finite element analysis especially Bathe's book? in chapter 3.3.2 variational formulations. the functional of the problem is been given, without explaination. for example
functional for all examples in this chapter is given like this. I'm wondering how did it drive to the functional
not sure this is the right channel, but I assume it is
oh never mind. later it explains functional are driving backwards by governing diff eq and natural BC. interesting and a bit convoluted
i'm tweaking
can anyone give an explanation of how taylor seires can be used to find the order of convergence for iterative methods?
specifically newton's method
you can read the analysis of newton's method on wikipedia
Most introduction to numerical analysis books contain it too
Is it correct that the order of convergence is 1 and the rate of converge is 1/2?
This code works fine but I'm wondering if it is considered bad form to have "if" and "else" at different levels of indentation, or is that fine?
it's a miracle if that code works as expected because indentation in python changes the meaning of the code
and yes it's bad in general
what would the correct code be?
if you have them on the same indentation then ver only looks at the first element of the list i think
there was definitely a problem
but i was doing it a few hours ago so i cant remember exactly what it was
actually
could you say
python code doesn't even run if you mess up your indentation
so why is it miraculous?
that mine runs as expected
because indentation in python changes the meaning of the code
but is the meaning of the code not what ive inteded it to mean?
well who knows, it's not immediately clear what the intent was and what the actual behavior is until you run it, maybe it works for some and doesn't for other inputs
you might think you meant one thing and the code will do another thing
the intended meaning of the code is to complete the task at the top
all I'm saying is that abusing indentation leads to incorrect code and makes it harder for other people to read
that would be a fail from me.
What’s a correct version?
remove the else:-line and one tab in front of return False
i'll take a look at this tomorrow
thanks for the help
I could use a nudge in the right direction with this one. I see from the given information that p(x)-f(x) is orthogonal to all polynomials of degree 4 or less - and I think this should tell me that p(x) is the best approximating polynomial of f. From there, I am not sure where to go. If I can somehow show that p(x)-f(x) less than \epsilon e^x, then the conclusion follows but I haven’t found any justification that p(x)-f(x) is less than or equal to \epsilon e^x
a short version of your function def ver(x, lst): return True if x in lst else False
i think the else is unnecessary
loop through and return if found
if the loop runs through completion and true is never returned then return false
it is wrong, and this shows why python is a shitty language if this code runs.
it is unnecessary for this context, but else statements on loops are an officially supported feature which activates if a loop terminates naturally (rather than via a break statement)
point stands 😛
You know it is horrible code (and a stupid and pointless addition to a language) when the documentation is formulated as ```(Yes, this is the correct code. Look closely: the else clause belongs to the for loop, not the if statement.)
(if anybody can enlighten me, please do 🙂 )
This is much nicer
Thanks
Another version that is probably the most readable and clear def ver(x,lst): for val in lst: if x==val: return True return False
could you share the pdf? im studying his lectures rn
can someone explain q4 to me
why would be not be using test functions that are 0 on the boundary
in the weak form we can do ū=u-x so that we choose ū and v that are 0 on the boundary and avoid the boundary terms
what is good text for learning predictor-corrector schemes?
Because your original PDE has a nonzero boundary condition, you can try to discretize this problem by using test functions that also have a nonzero boundary condition. If you use a zero boundary condition for the test functions, then no linear combination of those test functions can ever satisfy the boundary conditions of the original PDE
but doesn’t lax milgram require 0 boundary conditions?
unless i’m mistaken
or is it just the conditions on the bilinear form and linear functional over whatever space you’re working in
also, what would my weak form be for the problem? Expanding around utilde=u-x would make it a weak form not suitable for the galerkin method questions
I can. but i can't contact you
sorry for that... could you try again please?..
Every example I see of RK4 shows it using an example like this:
dy/dx = f(x, y)
Then you are given the formulas to calculate the K1-K4:
k1 = hf(x0, y0)
k2 = hf[x0 + (½)h, y0 + (½)k1]
k3 = hf[x0 + (½)h, y0 + (½)k2]
k4 = hf(x0 + h, y0 + k3)
and then finally use a weighted combination to estimate y1 (estimation of y after the time step). Something like:
y1 = y0 + (⅙) (k1 + 2k2 + 2k3 + k4)
But how do I apply it to ODEs that have more than 2 dependable variables? What if I had this instead?
dy/dx = f(a,b,c,d)
Or maybe even more? How do I apply RK4? What would my k1-k4 look like? K1 seems simple to compute with the initial values I already have for the system. But for k2-k4, what do I do?
I would really appreciate some help with this. Please feel free to share resources where detailed worked out examples of this is provided.
y can be a vector in your first things
Does anyone know if there is a generalization of the sampling pattern for triangle sin the FEM to arbitrary regular polygons?
i.e, if you have ar egular polygon you want each side to have n samples
and for the itnerior to be populated with a regular sampling pattern that is roughly uniform
The square pattern is trivial
but starting with the pentagon I can't think of a goo pattern
Haha even for triangles the question is not so easy
But I think your best bet for n-gons is to divide them into triangles, assuming that they are not too irregular
Like divide a pentagon into 5 triangles which meet at the "center"
And then assign points in each triangle, making sure they meet correctly along the edges of the triangles
for triangles it is trivial it;s the sequence yielded by 1+2+3+...+n starting from the top vertex.
The issue with that generalization is that it doesn't match what you would expect at the square
Well you're assuming that you want the points to be equally spaced
Which usually, if you are going to high order things, you don't want
What do you mean? I have my 12 odes like this:
Ummmm are you sure you want to write f for each right hand side
I already have the functions written. That is not the problematic part.
The problem is in working out what should be my inputs for k1-k4
Ok well write [p_n, p_e, p_d, phi, theta, psi, u, v, w, p, q, r] as a single vector
Let's say x
They are part of a state vector
Also what is your independent variable a la time?
Is there one?
If not, then you can write your system as x dot = f(x)
Where x is the state vector or whatever you want to call it
mine is autonomous, so no explicit t
Ok well if you're happy with the notation of x being the state vector
that is the straight forward part, I am not stuck with that. I am stuck with rk4
Say x_0 is the value of all the things at the previous time step
Then k_1=f(x_0)
k_2 = f(x_0+dt/2*k_1)
etc...
I have never seen an example of rk4 applied to more than 2 variables. And in most cases, it is always in the non-autonomous ODE form like x_dot = f(x,t)
Do you understand the k2 that I wrote
I was just about to ask if you could rewrite that as it looks confusing
Can we write latex here?
$k_2=f(x_0+(\Delta t/2)k_1)$
Angetenar
So in words, the functions are to be evaluated at the points where all the dependable variables become:
variable = previous_value_of_variable + 0.5step_sizek1
Yes
Thanks. Also, why doesn't k1 have the delta_t ?
What do you mean
should it not be k1 = delta_t * f(x0)?
Oh
That's a matter of convention
Doesn't really matter as long as you multiply by delta_t enough in the end
My goal is to vectorise like you said and then come up with a function that will apply the same steps to every function and update the state vector
I am not doing this for differential euqations. I just need the smapling pattern to be determinstic and uniform
as uniform as possible as long as it can be topologized easily and the pattern can be easily generated
that's why I am using the triangle and square as exmaples
those are trivial
one oyu do in a double for loop with constant indices
the other you do in a double lopp with the iner loop incrementing its iterations
you could not make it any simpler
@wide spear Are you familiar with C++?
hi yall im wondering about the benefits of using LU factorization vs taking an inverse vs row reduction when thinking about solving a matrix in terms of programming. mostlly just need very basic answers for an applied linear algebra exam tomorrow
reading off time complexities and what not seems like it really varies ig
the professor made a point that there is some roundoff error when we take some inverses cuz of the way real numbers are stored
also stuff like row reduction not being quick as back solving for triangular matrices
no clue how much he wants us to know for the test, shouldve gone to office hours 😭
@stone hill do you have Heath's textbook?
it's all answered here
that is what your prof is looking for. Understand and memorize. You will be golden
and by row reduction I assume you meant Gauss-Jordan Elimination. Which is in 2.4.8. Basically Gauss-Jordan Eliminationhas higher cost than LU decomposition
Inverting a matrix can introduce more floating point error if the matrix is ill conditioned
An upper triangular matrix is already LU decomposed
LU is basically gaussian elimination but it’s reusable since it’s not dependent on the right side
and from what i learned in my numerical analysis course, there’s almost never a good reason to invert its less accurate and more expensive than direct solve
I'm in the middle of the test (we get a recess) and he said to list three reasons
I think i listed em all
I hope i did
Good luck sir, next time try be ahead of the curve, know everything a week or more before the exam
Yep yep, I do try to have exam material in the back of my mind at least the week before and I answered correctly iirc. Next time I'll ask early lol
🤞🤞 hopefully it went well
Can you use a second order approximation of the neumann condition by doing something like $au(x,t)+bu(x+\Delta x,t)+cu(x+2\Delta x, t)$
Whoever
So you get $$-\frac3{2\Delta x}u(x,t)+\frac{2}{\Delta x}u(x+\Delta x,t)-\frac1{2\Delta x}u(x+2\Delta x,t)=u_x(x,t)+O(\Delta x^2)$$ and so you get
Whoever
If $u_x(0,t)=0$ is the Neumann condition then you have a difference scheme by solving for $u(0,t)$ so
$$u(0,t)=\frac43u(\Delta x,t)-\frac13u(2\Delta x,t)$$
Whoever
Yes
Please plot better
You can rotate the plot with a mouse
anyone knows a relatively comprehensive book on multigrid methods?
Multigrid by Trottenberg, Oosterlee, and Schuller.
does the time step matter for image proccessing ? especially where it is chosen to be proportional to spacial step and only the proportion is left during the calculation
i understand that the time step is critical in modeling a physical phenomena to know what is happening in a specific time, but in image processing we are not interested in time rather than the iteration itself in genral
please ping me for insight
I'm a little late but your prof makes a good point. Inverting a matrix does potentially create some computational errors when applying this in practice (as I dimly recall from my linear algebra days).
r there any resources online where there is publically available data for any expeiment/measurement besides research papers? like a github community or something
do you guys think numerical analysis is harder than real or functional analysis? I see too many computations and my mind halts
ive only taken numerical and it seems pretty managable. It's just alot of memorization of algorithms in the tests
or rather memorize what means what
from what i heard of ppl that took real, real is harder
you could look at Kaggle
a lot of free datasets
thank you, i will look into it
I did have that feeling but once you start implementing you get what it is about, the calculations are done usually by computer, and numerical analysis is more about defining and studying if the algorithm is going somewhere meaningful or not, as well as getting insight on how we define an algorithm that will gives meaningful results
Is it valid to say that for "large n", the first sequence is approximately equal to 2^-n and then work with that sequence to find the order and rate instead?
Sure
how else would you calculate it (by hand)?
evaluating $$\lim_{n\to\infty} \frac{ \frac{1}{2} 2^{-n} + 3^{-3\cdot 3^n} }{(2^{-n}+3^{-3^n})^q}=\mu$$ seems like a nightmare
Douglas
Yes that's why you ignore the 3^-3^n
If you see many computations, it might be numerical methods instead of numerical analysis.
Edit: I will try to elaborate more as an effort to respond to the question mark. When I took a numerical analysis course, it was based on a textbook titled "Theoretical Numerical Analysis: A Functional Analysis Framework" by Han and Atkinson. Since @solid igloo mentioned that they saw too many computations, this contradicts with my numerical analysis course experience. As you can see from the textbook, it is a mainly proof-based course with not much computations, which is how a numerical analysis course should be. If the emphasis is more on computations, then I believe that is learning numerical methods, not learning numerical analysis.
So, is numerical analysis harder than real or functional analysis? I think the difficulty should be similar if it is the numerical analysis that I believe. Otherwise, I cannot say much since it depends on whether you like computations or not.
?
@wide spear Please would you take a look at my attempted answer for the second one?
$x_{n+1}=e^{-n^4} \cdots e^{-1}$ and $x_n=e^{-n^4}$, so $$\frac{x_{n+1}}{x_n ^q} = e^{(q-1)n^4} \cdots e^{-1}$$
Thus the order is $q=1$ and the rate is $\mu=0$
The rate doesn't seem right as apparently when $q=1$ we should have $0<\mu<1$, which isn't the case here
Douglas
I am not square what your x_{n+1} is supposed to be
binomially expand (n+1)^4
and then use the fact that exp(x+y)=exp(x)exp(y)
Yeah so where have I gone wrong?
the full expression for the quotient is $$e^{(q-1)n^4} e^{-4n^3} e^{-6n^2} e^{-4n} e^{-1}$$
Douglas
So you can see that if q>1, the limit is infinity right
yes, because n^4 would have positive coefficient
and hence blow up
so the only option that avoids this is q=1
but then you still have all the "decay terms" from the other exponentials
and they go to zero
So the limitation of the definition of rate of convergence that you are working with is that it is limited to things that decay "slow" enough
This example decays too quickly to fit into the framework you've been given
answer updated to address the question mark.
Numerical methods and numerical analysis are the same things
It can be taught in very different ways
How to do this?
I said that $f(x)\approx \frac{1}{6} f^3 (x^) (x-x)^3+\frac{1}{24} f^4 (x^)(x-x^)^4+\cdots$
Putting $\epsilon_n=x_n-x^$, we have that $f(x_{n-1})=\frac{1}{6} f^3(x^) \epsilon_{n-1}^3+\frac{1}{24} f^4 (x^) \epsilon_{n-1}^4 $ and $f'(x_{n-1})=\frac{1}{2} f^3(x^)\epsilon_{n-1}^2 +\frac{1}{6} f^4(x^*) \epsilon_{n-1}^3$
Subbing this into the formula for Newton's method and re-arranging, we get $$x_n=x_{n-1}-\epsilon_{n-1} \frac{1+\frac{1}{3} \frac{f^4(x^)}{f^3(x^)} \epsilon_{n-1}}{4+\frac{f^4(x^)}{f^3(x^)} \epsilon_{n-1}}$$ or equivalently $$\epsilon_n=\epsilon_{n-1}-\epsilon_{n-1} \frac{1+\frac{1}{3} \frac{f^4(x^)}{f^3(x^)} \epsilon_{n-1}}{4+\frac{f^4(x^)}{f^3(x^)} \epsilon_{n-1}}$$
Douglas
The algebra after this seems quite yucky so I think I might have gone wrong
If no one else answers, I can answer in ~5 hours
okie ty
Ok try but instead of including 4th derivative terms, use only the 3rd derivative in the taylor expansion
Let me know if you do this and still can't figure it out
Hello. How can I check that the error of and aproximation of a discrete sum by an integral is bounded?
In particular, I want to prove this fact. I know that the sum on the left can be aproximated by the integral of $1/(x^2 +y^2)$ in the disk of ratio R. Once I evaluate that integral the result is $2\pi \log R$ but I need to prove that the difference between the sum and the integral is O(1)
eduardo291299
I'm not sure if this is suitable for the tools of numerical analysis
I would recommend asking in #advanced-analysis
I will try to skim through that book, thanks, I had it on my computers as I looked for applications of functional analysis to motivate myself
Is it just linear? I'm thinking $f(x) = x^m$ where it's linear
L
In most cases, a graduate level numerical PDE course has functional analysis as its prerequisite
Hi, I am trying to understand the dual simplex method for maximisation, but many online resources seem to teach it differently so I am confused about the minimum test.
what is the difference between row 0, the objective function, and Cj-Zj? they do not seem to be numerically equivalent to me, but different sources use different ones for the numerator in the minimum test.
What exactly are the rules and edge cases for the minimum test? I have seen lectures stating to use the smallest non-zero absolute value, others saying to use the smallest megative value, so on.
i had made a post in #help-7|zen1thxyz and was directed to post here instead
In that case we have $f(x)=\frac{1}{6} f^3(x^)(x-x^)^3$ and $f'(x)=\frac{1}{2}f^3(x^)(x-x^)^2$.
Hence $$\epsilon_n=\epsilon_{n-1}-\frac{f^3(x^) \epsilon_{n-1}^3 /6}{f^3(x^) \epsilon_{n-1}^2 /2}=\frac{2}{3} \epsilon_{n-1}$$
Douglas
So it looks like the order is 1 and the rate is 2/3?
Yep
with which topological vector spaces? I am trying to read Rudin and the variety of spaces is boggling my mind, I asked a professor and he told me basically to see everything as a Banach or metric space