#numerical-analysis
1 messages ยท Page 1 of 1 (latest)
conjgrad gives x
you do x-b
obviously not zero
ok discord distorts your code, use three ` to paste code
also your matrix needs to be symmetric
why not just \
that is kind of the whole point of matlab
x=A\b
don't write your own subroutines for this use BLAS/LAPACK/MKL
how so, they are default in any os (except maybe windows?)
yes they are on mac
probably is what you want
find the solvers here depending on your matrix https://netlib.org/lapack/explore-html/d8/d70/group__lapack.html
you should use julia instead :---)
but i gues prerequisite to use fortran
you can call call julia from other langues or the other way around
(and I assume also fortran, never done it myself)
what do you solve for (I worked on cfd solver in f90 for a couple of years)
what kind of problem?
ok, don't know what that is (I don't do CFD any more) I worked on this code https://docs.snic.se/wiki/Edge
What size is your linear system?
So you mean 10000x10000 linear system then
So it sounds like you have some difficulty in linking to external libraries?
I think Sven Erik's suggestions are very good. There is also Pardiso which has some heavy duty linear solvers
I am also primarily a Fortran user so if you have questions about how to get it set up you can ask further questions here, or DM me
I would advice to learn to use something else. I have never heard about "IMSL" so had to google it. Those libraries are not used anywhere (just look at "customers", only one company is there by name, everything else anonymous) According to https://en.wikipedia.org/wiki/IMSL_Numerical_Libraries the latest fortran version was released in 2010 (no idea if that is accurate)
IMSL (International Mathematics and Statistics Library) is a commercial collection of software libraries of numerical analysis functionality that are implemented in the computer programming languages C, Java, C#.NET, and Fortran. A Python interface is also available.
The IMSL Libraries were developed by Visual Numerics, which was acquired in 200...
I don't know everything about this but I have worked with this kind of stuff for many years and have never seen it before ๐ And the homepage made me pretty suspicious ๐
don't listen to a stranger on the internet :---)
Since you're using Fortran and you just want to solve a vanilla linear system, I would just use something open source, can you probably find examples online, for example on netlib
10000 x 10000 isn't even that large
you could even just do it dense, nowadays
For SPD system, you could check out, for example, https://people.math.sc.edu/Burkardt/f_src/cg/cg.html
then look for gmres
I would consider myself a reasonable expert on these matters, though not on using specific libraries, so we can discuss further here after you try something
Sven Erik also sounds like an expert
I am not ๐
Concerning an iterative method not giving you a result, the story is always the same: iterative methods are actually bad on general matrices. Preconditioners are always needed.
You're a working academic too, right?
this is me www.2pi.se
I'll call you an expert ๐
on some things, not really cfd ๐
and you?
Oh, me neither. I do actually work with fluids sometimes, but my background is in numerical analysis and linear algebra, primarily fast solvers
I'm working on my website, I can show you when it's done, haha
I work in industry however
My team is always hiring, in case any in this discord is looking. PhD recommended or essentially required.
i am 45 ๐
what company? always people graduating here
nope. my first advisor started doing undergraduate when he was 35. he used to be a music teacher
I work at TSMC North America, which is based in San Jose, CA, United States
ok!
you in russia or kazakhstan?
ah
there is a Kazakh person joining my team
and one of my colleagues when I was a postdoc was also Kazakh
haha
well, I guess I cannot comment on that because I mostly know people doing math, haha
or physics, or computer science
good luck with your phd. I just got my first phd student ๐
I decided tenure track was not for me because I didn't have a good research program that could really support students. I probably wouldn't be a good adviser
I would have (or never gotten hired) been a very bad advisor if I continued with CFD. I was just pure luck I found an infinite well of ideas in linear algebra ๐
that's great. I probably should've leaned more heavily on linear algebra but I picked the wrong adviser and wrong postdoc for that
I was supposed to write a preconditioner for DG using a special technique, but found a pattern in errors that became my whole new phd these. I just switched fields in 1.5 years
I was pretty close to pure mathematics actually, but that's actually how the trouble began
yes advisor is important. i hope i dont become a bad one
a lot of the stuff I was working on just wasn't that "hot", even if it was interesting
the right way to go about it if one wants to stay in academia, is to find a second postdoc once you know what you want to do
and I had opportunities for that, but I decided to switch to industry where I could continue working on math but with a more directed focus
coming out of a PhD you don't really know anything other than exactly what your thesis was
haha well, it turns out I have nothing to spend my money on, the prime years of my life are spent in covid isolation ๐
that's why I joined discord, I was bored!
I was very lucky about postdocs, I was contacted and offered 2 positions after each other where I could do what I wanted, and then I got my own funding, which pretty much gave me the first TT position I applied for (at my old university in the city where I already had a house...).
same
I actually had plenty of funding and freedom too, but actually what I needed was more guidance in switching research topics, because my old research area was a dead-end
that is tough
I think I'm OK with it. I was finally able to switch fields and do something more fruitful later.
How is covid now in Canada and US? Here everything is normal (but not for me since I have had a transplant)
there is a new surge
I live in the big city so the risks are simultaneously higher, yet people are still partying hard
may i ask what kind of stuff you work on at such a company as a mathematician?
can anyone recommend a reference work for nonlinear programming
Computational physics for modeling the fabrication and material behavior of integrated circuits. My general research interests include numerical analysis, optimization, PDE, inverse problems
sounds very interesting, what kind of equations do you work with usually? drift diffusion systems? and what kind of inverse problems arise in this area?
We work with a lot of systems, depending on what you're trying to model. Optics is modeled using Maxwell equations through inhomogeneous media. Photochemical reactions can be modeled in a variety of ways, machine learning, level set methods, and all sorts of other empirical models. Plasma etching can be modeled using drift-diffusion models coupled to an electrostatic or electrodynamic problem. Chemistry can be modeled using statistical methods, monte carlo, but could also be done using density functional theory or other forms of quantum chemistry. Ultimately the inverse problem we want to solve is, to put it in mundane terms, the manufacturing inverse problem: given the target object, how do we manufacture it by selecting the correct manufacturing parameters?
all Krylov subspace methods will have convergence that depends on the conditioning of the linear system. Have you checked the condition number?
that is somewhat large, though not insanely large. Have you tried GMRES?
play around with it
depends on what you want. tolerance tells you what is considered a "good enough" solution. If you want it to be accurate to 10 digits it might never converge. And then max iterations tells you when to stop, if you can't reach the tolerance
For GMRES, most convergence results are based on SPD matrices, however it is possible for the system to converge even if it is not.
Indeed, it is an open problem to provide good theoretical justifications for why non-symmetric problems can still have good convergence under GMREs.
No, this is saying that the cost of each iteration is O(n^2), since it is applying a matrix-vector multiplication. In practice it is cheaper than this since you typically use GMRES on sparse matrices.
The number of iterations is pretty hard to determine apriori; for well conditioned problems it could be in 10 or less. If the number of iterations is more than a 100 or 200 it often means that you need to use a better preconditioner.
that sounds incredibly cool actually, thanks a lot for your elaboration! i should keep an eye on such jobs for after graduation
Is there a lot of communication between the Taiwanese in American teams?
Ik TSMC likes to keep lots of the advanced tech under lock and key
Yes but there is careful security
this seems impossibly slow unless you're running on a very low power or old CPU.
also how does it take 500 iterations on a 400x400 system? That is more than 400
can you send a pastebin with the matrix?
sure, i dont have a clue about fortran tho :^)
I think there still must be something wrong if it is taking 64 second for such a small linear system
but each system is only 400x400? normally you can solve such a system in 1/100 of a second
if you are running this on a small laptop then maybe 0.5 sec is realistic, not sure
in fortran how are you doing the vector operations such as "x = x + alpha * p"
how many iterations does it take
more than 20 or 30 would be a red flag to me, with a system as small as 400x400 sparse
do you have the vectors for this?
OK. It's probably not a big deal but normally you should use BLAS operations to avoid inefficient memory access
also, if your matrix is very sparse, make sure you are not storing it dense
but either way, I feel # iterations should be the key here
can you send them too, then i can check on my python implementation
the first column is the initial guess?
hmm idk my implementation is extremely quick
doesnt even take a second to solve this
o.o
well im using sparse matrices to be fair
yeah its in python lol
so i can just use all the sparse matrix stuff from scipy
Fortran should be fine and should work well, probably you are just missing some important implementation tricks to make it run fast
But as long as it works you at least have something to benchmark against
Do you know about matrix free ?
Matrix free means you don't ever store a matrix
Because you only ever need Ax, you can simply supply a function f(x) = Ax
You never actually use A
You only ever multiply it by a vector
It depends. Since your A comes from a PDE, it can be possible to compute Ax faster than doing a real matvec
Since Ax is likely some sort of numerical differentiation of the physical solution x
I'm mostly worried about the time it takes to do even one solve. It seems pretty high, but I'm spoiled by using fast low level libraries and fast CPUs
So try the sparse thing and see if that helps
I seem to recall your real case is a bigger grid, in which case your current code will be way too slow
It should be covered in a numerical algorithms course
Well, that's about PDE
You need something that is more about implementation of numerical routines
You need both. PDE course tells you how to turn a math problem into a computational problem
But a numerical course will go over the details of the algorithms
At the undergraduate level there is a lot of detail that is skipped though
For example general galerkin schemes
Also meshing
Also theoretical stability issues are a classic mistake by engineers who are good at programming
It's easy to write a fast algorithm that blows up๐
Saad should have some resources on iterative solvers
you can probably get some inspiration here https://www.dealii.org/current/doxygen/deal.II/group__matrixfree.html
Hi, do you guys know whether thereโs a numerical method channel? Iโm having some thoughts about symplectic integrators and would like to ask some questions about it (in particular: if there are any good recent research on varying the time step for the symplectic integrators? Iโve read quite a few research papers on it and seem like varying stepsize for those is met with vague resistance)
this is p much the numerical method's channel
or i guess more accurately, where you would want to ask about that sort of thing
you could try some diffeq channels on julia slack if you do not get any feedback here.
:O thank you, I will!
e.g. you find https://chrisrackauckas.com/ there
website description
they only do fixed time step it seems, so probably there is a reason behind it https://diffeq.sciml.ai/stable/solvers/dynamical_solve/#Symplectic-Integrators
Oh word. I kind of know the reasons why fixed timestep is the thing, but yea... there ARE research papers about adaptive timesteps. I will reach out to the Julia group, thank you so much
i think it lets you solve systems of equations extremely quickly
and with very low memory usage
:^) i just know that in scipy using the banded solver is insanely quick, even compared to sparse solvers
๐
it uses superLU for the general sparse solving and LAPACK for the banded solver
and as far as i know superlu is insanely fast
i wrote an advanced heat equation solver in python and using a banded structure was so much quicker than using any sparse solver
yes yes but i dont think youd have to actually
writing banded solvers should be quite easy if you know exactly what structure you have
you could do a banded cholesky decomposition
actually dont know, you should compare it :^)
very easy to use in julia https://github.com/JuliaMatrices/BandedMatrices.jl
arent you the one using fortran? ๐
Fortran is the definitive language for linear solvers
Anyways, banded matrices are usually fairly simple to store because it is stored as some number of vectors. This improves memory access performance compared with sparse CSR format which is essentially unstructured
Consider the simplest non-diagonal situation: your banded matrix is symmetric and only tridiagonal. How would you write a matvec then?
so how is this different from your actual situation?
why wouldn't the tridiagonal matrix be stored symmetrically?
is that not the same in the case I mentioned though?
Do you have somebody in your group who can assist you? If not, we can speak more at length later. Let me know.
hi... trying to read karniadakis's spectral-hp methods for cfd and realizing that... I know nothing about physics, much less a lot of fluid mechanics
are there any fairly brief albeit rigorous introductions? In particular I'm looking to see the derivations of some of the basic equations
or if someone could point out a starting point for me to work on my own, would also be very much appreciated
LeVeque's FVM book or Blazek's CFD maybe
what do you want to solve?
maybe try out https://github.com/Nek5000/Nek5000
less interested in the solutions than the methods... mostly trying to get a feel for spectral-hp
a friend recommended it
thank you... will look into them
ah... fortran, my old friend
๐คฃ
they are not for spectral methods but more standard RANS etc
I would say you have more use of DG methods than spectral methods unless you want to do DNS with "accurate" turbulence etc
yeah I see a lot of DG things
I actually started a new collaboration last weekend on a pseudospectral method
with fluids people
translated the code to julia with high precision data types and get very nice results not possible to attain otherwise ๐
sorry -- what marks it as "pseudo"? something w choosing the basis functions? or the discretization somehow
no idea ๐
love julia sm. so much more satisfying to use than python but still just as easy to program in
not that I have finished many things yet... only entering third year of college atm
but definitely much easier to use and easier to debug
this is the paper i reimplemented https://www.mech.kth.se/~shervin/pdfs/2009_amr_review.pdf
hopefully the GPU things get hammered out over time
took a few hours ๐
well i just started to look at this (I only do spectral analysis of matrices, so that is what I do this for), it is not a complicated code. I usually include code in my papers
(and I have only looked a spectral methods one time before, so I don't anything about them really)
I only really have a very high level understanding of FE/VM and spectral lol
I remember talking to you months ago ab having a screwed up FDM matrix for an assignment though
my knowledge has worked out so far though lol... FDM has served me fairly well.
more recently I've been with a lab trying to take a MATLAB code for an ODE system that uses a spectral collocation for the heat field(? I understand this to be the correct term, but sadly, again, I am really lacking on the physics education) to simulate, of all things, bubbles
surprisingly interesting from an engineering standpoint... there's this one talk I really liked
Produced in association with Caltech Academic Technologies. ยฉ 2006 California Institute of Technology.
- Learn more about Professor Chris Brennen: http://www.its.caltech.edu/~brennen/brennen.html
- Learn more about Caltech's Public Events Series, including the Earnest C. Watson Lecture Series: http://www.caltech.edu/public-events-series
I wish we had done it in julia over python, but I don't think we could have gotten the same OO sort of modularity
currently at somewhat of a stand-still lol... the PI seems to have had a very old version of the code so the paper he told us (me and one other who is halfway on this particular project) to go redo is just totally inaccessible
lots of bugs in the physics, but, again, the challenge was that neither of us (since he is also applied math + computer science background) properly understood the physics
so we could not confidently say whether or not a particular result was realistic outside of the absolutely obvious
excited now though because, after weeks of asking whether or not (really telling him) the original code was just deep-fried, he finally looked at it and realized that the original code was just deep-fried. got the bugs hammered out, so it's pretty much at completion
it was missing some serious chunks of what would have been needed to generate like 8/10 of the figures in the paper lol. definitely got written off as the "idiot undergrad who can't do these straightforward things" for that while
ah ok
the pseudo bit makes sense then
LOL I almost just redid it one weekend
well
the idea is that we take that solver
generate 1000's of outputs
and use to solve the inverse problem
which is what we are actually interested in
and well
python is certainly not known for its throughput
got it
this is already >>>>>>>>> the karniadakis
this is math --> physics
vs physics/ engineering --> math lol
thank you
I mean for us it's "ok"
we are solving the keller miksis equation
which is just an ODE
and in some cases we have a PDE or two to solve
as part of the entire physical system
lol I remember
trying to run something with like 7 million time steps * 2400 space steps
1d finite difference scheme
oh my god
took 4 hours
I thought 4 hours was crazy
I had done that in julia
my buddy had implemented in python
he had something closer to 8-9 hours
pretty sure he screwed something up with for loops...
ptsd from poking through golub
๐ญ
Windows

๐ฆ
lmao
Are you intentionally trying to write everything from scratch?
You are a new student right? It is helpful to do it from scratch at least once.
However, in terms of portability, using open source libraries should be no different than writing from scratch
Ha ha ha
Looks great
haha, well, it looks pretty at least. I have no idea what the solution is supposed to look like!
correct in the eye norm
I'd love to help you but I already do 8 hours of debugging every day lol
I write numerical software so it's not quite as bad for me since I can write the code myself, properly
have you tried seeing what happens when you just do it with 3x3 matrices?
lol grading is so nice
not for proof based courses
anything at or below vector calc/ calc 3/ multivariable is a huge win
TV show and, indeed, see answer -> right? full mark unless 0 work wrong? -> does the first half look right? half mark. if not? -> no credit
my professor dad always tells me "graduate students are slaves"
This is highly dependent on funding structure and department structure
In my graduate school, most students were funded directly by department or university fellowship so they were not in any way beholden to their adviser; their adviser was merely the main person for them to collaborate with them and to give them mentorship and research directions.
floating point arithmetic is not associative
(I thought it might also not be commutative but I suppose in the sense of binary operations, it is commutative)
Yes, because of rounding. 1 + (2^60 - 2^60) makes 1, but (1 + 2^60) - 2^60 makes 0 because the result of the first addition rounds to 2^60.
You should check out a numerical analysis text. It'll cover some of these issues in more detail, especially in the context of matrix operations.
there are also other numerical issues with algorithms try for example in matlab:``` n=5;a=[1 2;2 1];A=kron(diag(ones(1,n)),a)+kron(diag(ones(1,n-1),-1),a);
eig(A)
eig(A')
moving the conversation from mv calc, from something i just watched it sounds like i need to evaluate the different scenarios for inequalities separately by rewriting the lagrangian to include or exclude my constraints
I'm not familiar with that approach, it sounds bad tbh
the whole point of transforming your problem is that it becomes easier to solve, yet the solution of the original problem is recovered (or partially recovered, but this is where a whole book worth of theory comes in)
alright, im watching a bunch more that cover the khun tucker in more detail
maybe im misunderstanding this video?
it sounds like he just solves the separate lagrangians with different combinations of the constraints, but in practice this doesnt help me since none of the solutions end up being usable.
the lagrangian with both active isnt solvable, the one with both inactive just minimizes variance with complete disregard for anything else, then the ones that are solved with on condition active are both invalid solution when plugging into the constraints
here he skips some steps, namely that the initial condition is that $(x_1 + 2x_2 - 3)\lambda_2 = 0$, which can be solved by either setting lambda to zero or the constraint, there's also another assumption that all lambdas related to less than or equal to constraints are non-negative (flip the sign on your greater or equal constraints to match that)
Transparent_Elemental
you also should understand that constraints like $w^Tw = 1$ can't be "inactive" because they're equality constraints
Transparent_Elemental
if that equality constraint is not met then your solution is wrong, however for inequality constraints that's not always the case because they form an entire region where that constraint is satisfied
oh right right i forgot i was only making the returns an inequality
jesus, theres a concert a mile and a half away and i can hear the music like its my neighbor playing
though as I said earlier this can still be somewhat problematic to solve because you can't fully avoid the possibility that the constraints are inconsistent and no solution exists or that most of your problems could be due to the entries of a matrix being all near zero
ok so, right now i have two cases: one with the returns constraint active and one without. My understanding was that the solution needs to have a positive lambda for the inequality constraints, but i am still writing the function as before
f - Lm1 * (g - 1) - Lm2 * (h - ExpectedPortfolioReturns)
something like this for the active constraint
f - Lm1 * (g - 1)
and this when h is inactive
essentially yes
yea the solution for the second case doesnt meet the constraint when plugging in
different texts use a minus or a plus sign and I'm not sure what's up with that, but you may as well try to put a plus sign instead of a minus in front of each lambda
oh wait, thats weird i noticed that if they use plus then they rearrange whats inside the parentheses, this guy doesnt i think
well yes $(a-b) = -(b-a)$
Transparent_Elemental
but for equality constraints this doesn't matter
didnt help
I think scaling your matrix by some positive constant might be beneficial as well since that preserves the positive definiteness and makes the condition number smaller which improves numerical stability
ill try it
this is called complementary slackness
am i looking to scale it relative to other values?
my average returns are also in that range
my weights outputs are quite a bit larger though
well if the matrix entries have some sort of interpretation, then that interpretation would also change
I suppose that should also affect the average returns if that's connected
well the variance matrix is pretty isolated
i only use it in my main function
the mean returns constraints just looks at the average daily returns of an asset, but i guess when they get written together as one function it may be difficult to work with such different magnitudes
yeah that's a common problem that needs a fix
essentially this leads to really elongated function's surface and poor convergence of most algorithms
0.014136 -0.000714 0.000540
-0.000714 0.001607 0.001108
0.000540 0.001108 0.012717
here are the cov matrix values
[[0.03724025]
[0.00341955]
[0.03586755]]
and the returns matrix
i mean i scaled them both by the same factor
so im not sure what ive accomplished necessarily
unless it does help if i move away from small values
it made the b) a little closer to c) that's for sure
though the constraint might have changed unless the values in return matrix are related by same constant to values in covariance matrix
just to clarify, for the three functions: f = weights.T times covarianceMatrix times weights, and the 2 constraints: g = sum of abs(weights) = 1 and h = weights dot returns, i scaled both the covariance and returns matrices by 100
but theyre not related in any way
when I said related I meant in more real life way related, like if you're modeling say a car and you're optimizing the function which outputs values which are interpreted as liters, then it doesn't make sense to have constraints measured in kiloliters for example
and here the function and constraint are related by constant "kilo" which is 1000
if this is what you meant by them not being related, then do the scaling for the matrix only and not for the constraint
actually im not sure y im saying not related, i mean the covariance matrix is literally the variance of those very same returns
i just mean that scaling one wont affect any other calculations with the other
i also tried just scaling the covariance matrix the first time but it didnt help
i still have to fish around to find a good target return to find some solution
the proper way to scale this would be to scale the matrix and calculate new returns for these new scaled values, even though they might not match the reality anymore since that's just a numerical trick to have better convergence
but the solution should be identical
im a bit confused about what u mean for calculating new returns. do u mean also scale it? both the mean returns and covariance matrix come from the same data table containing percent change in prices
I'm not actually sure how to perform this in this specific case because it almost always depends on what the formulation of the problem is and how one would scale it, assumptions used etc
no, actually I didn't think of that
since it's linear constraint scaling it by positive number doesn't change anything, so you're safe
might just scale the matrix alone and try different scaling values as long as they're positive
i mean i can think of one solution for example
where im just 100% in the asset that meats my target return
ahhhhh, man i was a bit let down by the inequality constraints
didnt work at all as expected
idk what to do at this point, i hate stopping on a problem mid way... but technically i learned how to solve the original problem where weights add up to 1. On some level it has some meaning / use if i scale my values back down to add up to 1. i get less returns but very stable curve (100% overfitted). I can also claim that since i offset positive weights with negative weights for correlated assets then im offloading some of the risk for leverage etc
ok im gonna have to go to sleep. I'm probably just gonna try to wrap this one up with the original constraint without absolute values
you can try some more with non-linear constraints $\lambda g(w) = 0$ instead of trying for each separate case
but i appreciate all the help. a lot less frustrating when im not doing it alone
Transparent_Elemental
wdym for each separate case?
wait so what alternative are u thinking of
u mean specifically when im defining constraints
or when im differentiating the lagrangian
just keep the problem as $L(w,\lambda_1,\lambda_2) = f(w) + \lambda_1g(w) + \lambda_2h(w)$ and constraints $g(w) = 0, h(w) \leq 0, \lambda_2h(w) = 0$ and try to use some solver for non-linear equations
Transparent_Elemental
I'm not sure if you did everything correctly there, but the lambda * h = 0 constraint is there only for inequality
like, yeah, differentiate lagrangian with respect to w here and set it to zero as well
and transform $h(w) \leq 0 \rightarrow h(w) + y = 0$
Transparent_Elemental
essentially now your variables are w, L1, L2 and y
im still pretty unfamiiar with the inequality constraints...but the part i did do was just differentiate without the inequality constraint term
if y happens to be less than 0 or L2 is negative at a solution then the constraints are inconsistent
f - Lm1 * (g - 1) - Lm2 * (h - ExpectedPortfolioReturns) -> became -> f - Lm1 * (g - 1)
but the solution wasnt good
did you keep the constraint on returns? as in when passed to solver
no i made returns the inequality and kept the weights = 1 as the equality
I mean when you did this
oh that was just manual
like i rewrote the original without lambda 2
i thought thats what i was supposed to do, just get rid of the term
and then differentiate and solve the new system of equaitons
that only works if lambda 2 is zero or the constraint is satisfied
so here you assumed that equality constraint is always satisfied (which as I explained earlier you can't do)
the equality to 1 constraint?
the (h - ExpectedPortfolioReturns)
ok, Lm1 term is the sum of weights must equal 1 term
its the equality
and the Lm2 term was the inequality
which i got rid of
that should be fine then unless the assumption that it's satisfied at a solution was wrong, which it could be
like minimizing x^2 + y^2 subject to x + y <=1 is a problem where the constraint is always satisfied
well the solution didnt satisfy the inequality constraint
so idk what to make of it
like the returns were basically 0 and it just went for the lowest variance possible
yeah that makes sense since the objective function has global minimum at the origin and that's always the case for a quadratic function with positive semi-definite matrix
that means your constraint cuts out part of space that contains the origin, hence it's important to include it
like this
so is my constraint faulty?
no, it's perfectly fine
constraints that are always satisfied can be safely excluded from the problem and not alter the solution, but in this case it can't be without altering the solution
ah
I just explained why your solution is zero when you got rid of the constraint
yea and the constraint cuts out that region
my bad im losing focus at this point
ok im giving it one last shot with another solver
and them i really should sleep
Could not find root within given tolerance. (0.0403132603376069410332 > 2.16840434497100886801e-19)
this was the error i would get from sympys nsolve
would work on occasion with a different constraint target
i was hoping it would work with an inequality constraint but im not sure it represents what i was looking for. i wanted to capture all the possible constraint equalities beyond an acceptable threshold, i honestly dont see how the method i was trying attempts to do that
this is some really small tolerance - 10^-19, can't you change that to like 0.01?
but yeah, probably try next day or something
sure, idk how to actually
i dont set it in any way
omg
i did it
i needed to manually disable their verify solution feature
and just take the given solution
๐ wait this is a big moment
im tearing up
ty again for all the help
did it solve it?
yes
Matrix([[0.465091640436819], [-4.76464570830837e-7], [0.351474840054128], [0.384214666550238], [0.100018502617098], [0.361464003032105]])
first 4 are the weights, last 2 are the lambdas
how to interpret those again?
in the videos they were also the derivatives of the function with respect to the constraint
there's like an economic interpretation of lagrange multipliers (lambdas), but i'm not a big fan of it and it seemed weird of it
in general the interpretation is that if some lambda is zero, then the constraint associated with it is not strict at a solution (equality constraints are always strict, although numerically this might not be the case)
if lambda is not zero then associated inequality constraint is tight, i. e. turns into equality
also now im extra curious about the inequality constraint... is there really no way to do what im imagining is possible? I dont like setting the target return to a single value thats extremely limiting... btw this solution had constraints written as equalities
i differentiated the lagrangian without any shenanigans or active inactive terms
well the points is that from theoretical point of view the values of lambdas behave the way I explained, but in practice you're never solving your problem exactly - always approximately and that leads to lambdas being somewhat positive
this is what was meant by the error with tolerance
you can solve it for large variety of t and collect it into a big vector for example, then analyze the contents of that vector
Yes thats what im thinking
generally if lambdas associated with inequalities are negative then the solution isn't a solution at all
but lambdas associated with equalities can have any sign
actually I didn't think of this, but you said that this is with an equality constraint, but clearly you can transform inequality by just using slack variables to an equality constraint and solve that since you've fixed the tolerance issue and you're able to solve it with an equality constraint
if the slack variable turns negative then you know your original inequality constraint is inconsistent, but if the slack is zero then the inequality is tight and not tight if slack is positive
I just assumed that you did convert inequality to equality and solved that, so I suggested doing the vector approach to test different t's
Right thats the other method i saw
So those are two different ways of solve inequality comstraints?
Also the solvers only return one solution so its not like im getting some large set that i can then breakdown and test like they do in the articles
yes if by other method you mean setting up appropriate lagrange multipliers or constraints to zero and checking the solution in separate cases
well you can run a loop and change values of t by some increment in each loop, then store the resulting solution and value of t for further analysis
@plucky kayak Im having a bit of a hard time understanding this idea of proportional gradients for this problem
i tried to graph it
heres what i have
it looks like a solution would be minimum value on the black and red regions
but at the point where they all meet for example x=.8 y=.2, the gradient of the blue surface looks like its just a vector with an almost 0 y-component. How should i picture the gradients of the constraints?
what's up with absolute value constraint and 0x + 0y > 0?
i dont think it shows the small values
but i scaled the main function
wait writing it out it actually seems to make sense
in general you don't want to have non-differentiable constraints like ever, that makes most things fall apart
it would be easier to visualize if you plot something like x^2 + y^2 and a constraint of the form x + y = 1
i got an interesting think for the plot i made
wait nvm i just made up numbers and it was acoincidence
so the gradient of x+y=1 is constant and we are looking for where they are proportional
yes
huh, ok its a bit weird to think about what the gradient actually is for x+y = 1 now
but for a circle constraint for example ($w^Tw = 1$ is a circle constraint essentially) the gradient isn't constant, but it'll still be proportional at an optimal solution
Transparent_Elemental
gradient for x+y = 1 is just a normal vector for that hyperplane, gradient for general plane $Ax + By + Cz = D$ is just $(A,B,C)$
Transparent_Elemental
does it represent something? im used to thinking of gradient as direction of steepest ascent
and when i googled it they just gave me the slope formula
oh im thinking of this wrong... it doesnt behave like a 2d line does ti
for constraints you can interpret the gradient as a vector that's always orthogonal to the surface or a line of your constraint
or in terms of hyperplanes it's a vector that defines a tangent hyperplane to the constraint function at that point
$\Omega\nabla\phi$ is a vector that points in a direction that the tangent hyperplane goes along and $\nabla\phi$ is gradient of that constraint that defines that hyperplane
Transparent_Elemental
is there a reason that id doest just follow the same logic of being the steepest ascent
here are two other visualizations
it does follow actually
you can treat x+y - 1 as a function of x and y
but you fixed the value of that function to zero, so essentially you're only reviewing the contour line of that function with value of 0
oh ur right calculating it should check out
so for g(x,y) = x+y-1 the direction (1,1) is indeed the direction of steepest ascent
hmh, im still trying to think of it as a line, but as a hyperplane im just climbing straight up?
well yes for a hyperplane you're forever going into one single direction to infinity
im essentially looking for an intuitive explanation for setting the gradients proportional, i want to be able to explain it to someone who hasnt taken mv calc but now im realizing i dont understand it well enough myself
like setting up the problem and looking for where all the conditions are met on the graph is a nice way to explain things
but as soon as i need to start thinking of contour lines it gets messy
you can search for lagrange multipliers proof until you find one that you're satisfied with, there are multiple ways to describe same thing
ok one last thing about my visual model
its just not clicking
for reference this is what it looks like
i need a sec to edit my image
the point where the absolute value constraint is non-differentiable would've been the point where all constraint gradients are proportional to the function gradient, but the gradient of absolute value constraint at that point doesn't exist so picture doesn't make much sense
i thought my solutions would exist at the points i circled
but i cant imagine the gradients being proportional there
the gradient of your inequality also needs to be accounted for in that picture
since when i slice the blue surface at different z values i just get straight lines
these are also gradients of a single constraint, they're not different
ooh right
i had that thought and it slipped away
otherwise id looking all the way at the top or bottom
it'd be proportional at (1,0), but clearly you'd be dividing by zero then
yea i see what u mean now
alright then i think i get that part
i just need to extend it to multiple constraints
like y the sum of the gradients of the constraints are proportional to the gradient of the function
this is just your standard necessary derivative equal 0 condition for stationarity when you differentiate the lagrangian
you're essentially asking how to find a minimum of a function (which happens to be a lagrangian)
and calculus 1 course responds - take the derivative and set it to 0
well, not minimum, but some extreme point
oh man... yea
ok let me rephrase the question then
alright i need a minute to think about this
im getting some circle reasoning here in my head
so originally for singular constraints it makes sense
partial f wrt to x - lambda partial g wrt to x = 0
im gonna ignore the fact that im still uncomfy with the idea of proportional gradients at the solution but ill skip past that for now
now i learned the lagrangian form that contains all of these systems
so yes the gradient of the lagrangian being 0 is an extreme point, but what about the individual piece: partial f wrt to x - lambda partial g wrt to x - lambda2 partial h wrt to x
we wrote the lagrangian to reflect this... i just dont get y we add up the effects of the constraints
? it's not like we add them up because we can, it's just comes from how we constructed lagrangian
im confusing myself now... hanging on the edge of understanding and having no clue what anything means
$\mathbb{L}(x,y,\lambda_1,\lambda_2) = f(x,y) + \lambda_1g(x,y) + \lambda_2h(x,y) \rightarrow\ \nabla\mathbb{L}_{x,y} = \nabla f(x,y) + \lambda_1 \nabla g(x,y) + \lambda_2 \nabla h(x,y) = 0$
Transparent_Elemental
as to why we even defined lagrangian the way it is - I'm not sure, it didn't seem like there's a lot of discussion about it in the books
ok yea all the confusion just goes back to y theyre proportional at the minimum
everything else makes sense
ill go read some proofs
or try...
wait no, i mean ive seen the contour line intuition, where they barely touch...
therefore the the gradients are proportional since they point perpendicular to the contour lines
basically this... where the contour lines touch, they are tangent, therefore the gradients are proportional
the proof goes something like this: suppose P is an optimal point of a constrained problem, that is f(x) achieves extreme value and constraints are satisfied. Then project the gradient of f(x) onto the constraint and call that vector v. If v is not zero vector, then you can move in the direction of v and achieve a better value of f(x) while still satisfying the constraint, hence P is not an optimal point - a contradiction. Hence gradient of f(x) is exactly perpendicular to the constraint to ensure the projection is zero.
its starting to make sense, all except for the statement that u can move along the direction of v while still satisfying the constraint
oh like as in there is a vector that keeps u on the constraint
wouldnt that be more maximization?
i guess when u set the gradient to 0 it doesnt matter
you don't set anything to 0 here
yea, got confused for a sec going from the reasoning to the final formula
i get y it makes sense for arriving at a tangent contour line, just wouldnt v always point in the direction of the maximum?
we don't care about where v is pointing, we just only care about whether it's zero or not
which is equivalent to gradient of function and constraint being proportional or not
ok thats a nice way to think about it ty
could anyone give me some hints for the following. I am solving an exercise that is about investigating the stability of the following leapfrog scheme:
We end up with these update equations and I want to show that it is sufficient for stability if the eigenvalues of this matrix A are bounded by unity (or their absolute value im guessing)
a hint is given for this exercise, which is basically to diagonalize the matrix but I dont see how this helps
Oops ive cropped a bit too much on that pic
Oh never mind I think I got it. Diagonalizing decouples the equations into multiple 1d relations which is a case we have solved before in that book
hi can someone help me with an implementation of cg in cpp?
i have following problem: im trying to implement this:
it should comput the following
my Code so far is :
the Problem is, that my u is zero and i dont know wehre my error is.
my gnuplot looks as followed:
i hope somone can tell me what i did wrong
internal to your algorithm you must compute A*x for various x; I would first check and verify that this subroutine is correct
depends on eigen values, CG can behave badly if it's not positive definite iirc
If you pick D to be ```
1 โ
โ
โ
โ
-1 1 โ
โ
โ
โ
-1 1 โ
โ
โ
โ
-1 1 โ
โ
โ
โ
-1 1
then D^T*D becomes second order FD, maybe that is what you want?
isnt it just the transpose of what I posted but with sqrt(2) and -sqrt(2) in the corners?
I am a bit tired so maybe wrong
are they missing a term in the first picture, there is a redundant + in the bottom right
and L as written there is symmetric
I never understood why people write assembly code like this for structured grids, when it is so much easier and clearer to just use Kronecker products
Laplace 2D second order FDM: L1=toeplitz(n1,[2,-1],[2,-1]) L2 same as L1 but with n2 then the 2D is just kron(L1,I(n2))+kron(I(n1),L2) literally one line instead of a bunch of for loops. and you can do it sparse if you like
D is easy to do with kron in Julia
role back on
so you certainly don't want to do like manual check of each constraint on each iteration and minimizing lagrangian is still a constrained optimization problem since that also involves constraints on where the solution can be
conjugate gradient is just another way to solve systems of linear equations if you can convert your problem to that form
well i think i still want to keep my constraints
the problem is that this system with lagrangian is likely to have indefinite matrix and you can't apply (without extra tricks) conjugate gradient to that or it'll risk diverging
ok, right now im a bit behind. my thinking was that i am currently minimizing the lagrangian by solving a difficult system of equations, and that using a small increment and gradient descent i could get a potential solution as well
because although i am getting solutions, today i realized that theyre actually incorrect and dont meet my constraints for anything over 6 equations
the error just becomes too large
im not sure how to think about divergence for a function like this
well basically there's many different ways to approach this problem, you may as well try like doing everything manually, but that'll be likely slow and the solution would be suboptimal i. e. it won't quite converge to it even if all constraints are met
ok ill try that first just to get my hands dirty
but ill try to research what the proper approach is as well
the book that I was suggesting already Jorge Nocedal's Numerical Optimization has numerous different approaches to constrained problems like that
there's like constrained newton method, projected gradient, constrained conjugate gradient, interior point methods, directly solving KKT system, etc
alright ty, ill try to play around w what idea is best
Is anyone here familiar with the (von Neumann) stability analysis of explicit finite difference schemes?
Seems like when investigating the stability of coupled difference equations we often get the following relation in vector form:
$$u^{n+1} = A u^n$$
Faputa
What is confusing me is that few textbooks I've read then proceed to look at the eigenvalues of A and conclude that the method is stable if the spectral radius of A is <= 1
But shouldn't we be looking at the norm of matrix A? If A is not symmetric it can happen that the norm is larger than the spectral radius. Still the textbooks seem to look at the eigenvalues even in the non symmetric case. Sounds familiar to anyone?
if solve the recurrence you get $u^{n+1} = A^n u^0$, now on can show that the spectral radius of $A$ is strictly less than $1$ if and only if $A^n\to 0$ as $n\to \infty$, so if the spectral radius is less than or equal to 1 this means that your solution wont blow up
whzup
Oooh okay that makes sense. But does this only work in one direction? As in we know that spectral radius <= 1 means the solution won't blow up. But is it also possible for spectral radius to be >= 1 and the solution not blowing up anyway?
Okay I guess that would be possible if $$u^0$$ was pointing in the direction of the eigenvector associated with some <= 1 eigenvalue
Faputa
yes exactly
Hi, suppose I want to simulate
$y_t = \frac{y_{xx}}{1+y^2_x} -\frac{1}{2}xy_x + \frac{y}{2} - \frac{n-1}{y}$
shiburin
say with initial conditions near $\sqrt{2(n-1)}$
shiburin
which scheme would be better to study the long time behavior?
I am very new to numerical PDE
in what context does this PDE arise?
looks like minimal surface stuff
Rotationally symmetric rescaled mean curvature flow
https://arxiv.org/pdf/2105.06613.pdf i found this, they just seem to use standard finite difference approximations of the derivatives and a simple euler scheme for time integration
Would you mind suggesting another scheme? I think FD needs very small time step and my laptop is not efficient enought to do this
i mean you can easily switch out the euler scheme for any other time integration like runge kutta id say
the problem then is just that you have to evaluate your rhs multiple times
Thank you
i was wondering, is it possible to reparametrise it using (x(t), y(t)) to get an easier, 2d system of equations which maybe is parametrisable by arclenght? this could maybe simplify the problem a bit
hmm havent thought about that. I will do the calculations later
Does anyone have some experience using the Fast Wavelet Transform (FWT)? I'm trying to implement it, but the only examples I see online are for the Haar wavelet, but in my program I'm using Linear B-Spline wavelets.
You should use standard debugging strategies, such as trying a smaller test case, and printing out intermediate values in the calculation.
You should be comparing the two codes, checking their state line by line
The point is that you're going to identify the exact moment at which the state of the program diverges between the two
It's one possibility, yes
It's also worth shrinking the problem size to something manageable and seeing if it differs there too
I would be more interested in checking the difference between the two codes, not the correctness of the algorithm
so the error you should be computing is the error between the two
due to compiler optimizations and non associative flops, there will always be some small error at the very least
you should also check if all functions that you use do the same in matlab and in fortran
or that the indexing is correct since matlab starts at 1
can you send your fortran testing script?
You keep saying this and I'm saying that you need to identify why it differs by debugging it line by line.
Check the value of every variable after each state update
Clearly, at some point, it'll differ, and then you'll be able to pinpoint why the behavior diverges
further, assuming you disable any compiler optimizations, and assuming your inputs are identical (check this too), then both matlab and fortran should use the same IEEE floating point arithmetic, so they should normally be not just similar, but identical
a computation died after 308 days ๐ฆ
painful
jesus
Updated julia packages during the run, and it was global scope, and some check in Manifest.toml broke everything. No idea why, shouldn't happen. So checkpointing with criu wouldn't have helped. Will always use a local env in the future.
you can compute connected components of graphs by using depth-first search, are you familiar with this kind of graph algorithm?
I guess it's just a true/false constraint, but you could encode it as "size of the largest connected component" and check that this size is as large as the # of total shaded cells
you have to compute it with DFS
so you're trying to convert the statement about the size of the largest connected component into a boolean sentence?
I guess in theory this must be possible, but I don't know how. You'd have to phrase the entire algorithm essentially in the correct propositional logic
unfortunately this is outside the realm of my expertise
I guess if I were to do it systematically
I'd start by writing out the algorithm in a familiar way
and then try to convert each piece of the algorithm into the desired boolean form
oh, I just wanted to do a DFS on the graph
That is the known algorithm, at least to me. It completely solves the problem of finding the connected components. But you're right that it does not fulfill the needs of your problem
Hello everyone,
I'm a high schooler and
I am looking for a topic to settle on for a math exploration.
my preoccupation is with the Fourier series and transform. I successfully self-learned Calculus. I learned the basics of it how to get the Fourier series and transform and also the complex Fourier series(knowing it has a lot of applications). I'm mostly fine as long as the integral does require advanced techniques for my high school level.
I now face difficulty. Despite Fourier having a lot of applications, I'm unable to find something solid.
I know the fast Fourier transform and its applications but I would prioritize something more on the mathematics side( I do not rule out the FFT too). my question is there something I can do with it that is not extremely complicated? Something suitable for exploration. I am open to self-learning as long as the prerequisite is commensurate with my level. Or should I drop this topic? I need your insight please ๐.
Excuse my poor English I am not a native.
hey I did something quite similar towards the end of hs too (a few years ago now).
One thing which still remains accessible at this sort of level is trying to solve the heat equation (on a bounded domain) using fourier series or even on R using the fourier transform
Trying to implement dft and fft (in puthon say) and then comparing the time they take is another pretty fun thing to do
That being said if you really want to start learning the math around the fourier transform etc you unfortunately need to have quite a big background (linear algebra, measure theory, complex analysis etc)
i wrote a hs project about fourier series, ft and dft, i can send it to you if you want, it is in french though
Thank you very much. Your comment is incredibly valuable right now. What a coincidence too I am actually French. I would greatly appreciate it. Can I dm you?
sure
can you tell it to only shade a cell if there is a shaded cell in its neighbourhood?
but doing stuff sequentially is not possible?
because if you can force it to only put shaded cells next to shaded cells this should give you a pathconnected pattern
but it also didnt do what i said
but maybe i dont understand the game lol
you start with no shaded cells right?
yeah ok but then if you require there to be at least one neighbour in order to put down a shade on a cell it should give you a pathconnected pattern
put down the first shade, then the second shade must be a neighbour
but idk how SAT solvers work maybe that doesnt work
yeah ok but then the recommendation from mipchunk to use DFS makes sense, especially if you can implement it like you did on the screenshot
so you pick any shaded cell and then you use DFS to count the shaded cells that are in the pathconnected component of this shaded cell and you compare it to the number of total shaded cells, if its not equal it is not valid
idk i feel like otherwise you'd have to somehow account for every possible path connected structure on the board
probably but its an insane amount of constraints
thats a small number for computers tho ๐
ok so there is one trick you could use
for a graph G = (V, E) there are at least |V| - |E| components
so if |V| - |E| > 1 the graph has more than 1 component
in your case |V| is the number of shaded cells and |E| is 1/2 * (sum of the neighbours over all shaded cells)
yes
this only gives you a sufficient condition to reject the board tho
if it is not rejected by this method it does not mean that its a valid board
actually no idea, i have it in my lecture notes but these are in german :X
actually maybe you should ask in #combinatorial-structures i think they should be fit af in graph theory etc.
was there a bug in your code?
well nice, make sure to test it well so that you're sure it works!
Any idea for a) ?
Hint: Lagrangian is the integrand.
Second hint: what's the Euler Lagrange Equation of some Lagrangian $L(y,\dot{y},t)$?
kirby
There are explicit formulas for inverses of tridiagonal Toeplitz matrices (and eigenvalues and eigenvectors of your A are known explicitly)
ok my battle plan would be to use induction to show that the inverse of tridiag(c, 1, 0) is given by $$a_{i, j} = \begin{cases} (-c)^{j - i} \text{ for } j \ge i\ 0 \text{ else }\end{cases}$$ (i think that should be possible somehow) adapt this formula for both $U^{-1}$ and $L^{-1}$ and use the submultiplicativity of the infinity norm
whzup
thank you
Hello everybody here, I have a question and I think this is the right Room.
Why do people work on numerical analysis of some unsolved PDEs like the Navier-stokes equations and many others? What are the benefits usually? and what would the theoretical add to that?
Please don't hesitate to share your thoughts or any links/resources where I can learn about these things.
Thank you in advance.
Hmm so I think the Navier-Stokes equations have direct applications to fluid mechanics / mechanical engineering in general
Any improvements would greatly help running these kinds of simulations
Stuff like aircraft and jet engine design
Not sure if you were looking for that
if we knew the analytic solutions to these PDEs on the all domains there would be no point in doing numerical analysis of them, i think the entire point is that we need these PDEs in engineering to design stuff so you have to somehow solve them and since analytic solutions are mostly out of reach you have to resort to numerical methods in order to get at least an approximate solution. But there are also theoretical uses of the numerical analysis, for example there have been papers using numerical solutions together with error estimates in order to prove other estimates (two papers that do something like that are https://ems.press/journals/rmi/articles/10738 and https://arxiv.org/abs/1905.06387)
I assume you mean numerical analysis of numerical solvers of PDE and not "numerical analysis of some unsolved PDEs". Benefits are understanding the behavior of the solvers, optimize the solvers and make them more efficient, create new solvers, show stability, show the need for high precision data types, and so on
In quadratic programming does every facet of neighbouring active set polytopes contain weakly active constraints?
Weakly active constraints as in active constrains whose Lagrange multipliers are 0
For polytope example
Thank you a lot for your answer.
Actually I am considering the fact that analytic solutions to PDEs are out of reach most of the time, but what I meant is trying to find numerical solutions and study their proprieties despite the fact that there is no proof that the PDE has a solution or if it has a solution we don't know if it is unique.
I have heard many times that people are working on the Navier- Stokes equations numerically, but we still lack a proof for the existence nor the uniqueness!
Thank you a lot for your answer.
Actually I meant the other one around. However, your answer was enlightening.
I have worked on developing CFD codes for industrial use (airplane design) and people do it because it is useful, who cares if one could prove the existence of a unique strong solution? People do not expect everything to be exact, it is done for insights and cutting development cost, and general optimization.
or rather, obviously some people care, but that is not the main focus of most people solving navier-stokes
Thank you a lot! yes, that's what I am looking for!
@fathom rain Would this happen at research level? I mean can people do research on solving numerically unsolved PDEs theoretically? Is there any conferences, or articles about this topic? (I expect something like a debate)
@heady hedge It always depends on the department. In engineering fluid dynamics departments itโs more about finding results that are actually applicable while in the math departments itโs about rigorously proving that this and that works and extending the theory
thank you a lot
One last question that emerged to my mind now is:
Is there PDEs that has been proved to be unsolvable theoretically but still people using their numerical solutions? (of course the Numerical solution would be with specific parameters to avoid the problems)
Yes, it is a very big field to solve PDEs that have no analytical solution. much larger that pure PDE analysis.
PDE people like to study weaker forms of solutions if that makes any sense
Can someone explain to me why people care about matrix pencils?
can someone recommend something educational to read on how to solve SDP problems? I know interior point algorithms are common for these types of problems, but I have no clue how to actually implement one and would like to learn more. I've read multiple PDFs on SDP solutions and a book on interior point methods for QPs, but the formulas seems quite different for SDPs and explanations that I've seen weren't great
You should care but you have to understand that these questions qre important to answer
No I don't really care (I don't do CFD anymore), but of course it is important. However, the vast majority of people doing CFD do not work on proving the existence and uniqueness of a solution of NS in 3D.
In industry itโs mostly like: It works and is somewhat robust? Bought
Meanwhile the mathematician having an existential crisis
Shouldn't care* sorry
any computer engineers here?
First thing: you never compute the inverse of a matrix; no matter what type of direct inversion scheme you want to use, you always want to store the matrix as some factorization. For general dense matrices the standard approach is LU factorization.
With that out of the way, the way to answer is more clear: you have to compare the sparse and low-rank structure of your matrix with the sparse and low-rank structure of its LU factors.
For some sparse matrices, the LU factors are not very nice (as you say, it can lead to a dense inverse). This is the issue of "fill-in", where a sparse matrix has not very sparse LU factors.
However, for certain types of structured matrices, especially those that arise from differential equations, the inverse of the matrix can have structure, which means that it can be possible to invert it faster by taking advantage of geometric properties rather than algebraic
Navier-Stokes equations are not completely unsolved. In 2D space, it is well-known that the existence and uniqueness have been established, see Ladyzhenskaya or Temam's book for more details. In 3D space, it is indeed an open problem, and if you look at Tao's recent work, he argues that the Naver-Stokes equations in 3D will blow up in finite time.
From a numerical analysis point of view, of course this matters a lot to how we establish stability/error estimates for our numerical schemes. We usually avoid doing theoretical studies in 3D since there is no established answer yet for the well-posedness of the problem. In 2D, even if we already have established results in the continuous level, when we discretize the Navier-Stokes equations using our numerical scheme, we still have to prove that our numerical scheme mimics the estimates and physical conservation laws that we have in the continuous level. This is not a trivial issue, and thats why we see a lot of numerical schemes that produce solutions that disobey the physics or they obey they physics but it takes forever to compute.
what's CG?
Thanks
Do u use slope limiters? If yes, then its sensible to check for conservation since some slope limiters are not conservative. If not, then its more about verifying that your implementation is correct. Usually you also need to compute numerical convergence rates and check whether they match with the theory.
I would think he's asking you to check the mass of your final numerically computed solution, and compare that against the mass of your initial condition. So you're not discretizing any PDE, you're just checking two vectors. So the point would be to check that your numerical implementation obeys the theoretically desired property.
Why thank you or why ask?
Well to look it up and figure out something about whatโs going on ๐
Think about this more abstractly. Your solution is a discretization of a function. What is that function and what physical quantity does it represent? And if you knew what the exact function was, how would you compute the mass of the system?
In particular I'm trying to answer under the assumption that I do not know what equation you're solving, nor what algorithm you're using to solve it.
He wants you see it for yourself because some numerical methods are not conservative
When someone says that FEM is globally conservative it means that if I calculate the l2 norm of my velocity field on the entire domain it should be zero but on a local element it might not be right? With FVM one gets both global and local conservation so the fluxes on each cell equate to zero too right?
Thank you very much for such a nice answer.
Can one understand from your words, that even if the equation blow up in finite time people still can use it in industry? and it can give efficient results?
How do I go from (\lvert\Tilde{u}{h}-u\rvert\leq Ch^{p}), where (\Tilde{u}{h}) is an approximation to the exact value (u), to (\Tilde{u}{h}-u=Dh^{p}+O(h^{p+1})) where (D) is some constant provided (\Tilde{u}{h}) depends smoothly on (h). Basically: how do I show that (\lvert f(h)\rvert\leq Ch^{p}) is equivalent to (f(h)=Dh^{p}+O(h^{p+1})) given that (f) is sufficiently smooth?
Victor H
Can someone help me understand how this is true?
I know a general (s)-stage RK method can be written like:
[
\vec{y}{n+1}=\vec{y}n+h\sum{i=1}^{s}b{i}\vec{k}_i
]
but I don't understand how they arrived at (\vec{k}i=AY{i})
Victor H
And henceforth how Y_i is defined
I think I managed
I need some help trying to numerically solve a problem of the following form
$$ min_{x \in X} max_{y \in Y} f(x,y) $$
Where $X = (\mathbb{S}^{r-1})^m$ and $Y = \mathbb{S}^{r-1} \times \mathbb{S}^{d}$.
My function $f$ is differentiable and even affine with respect to x (though idk if that helps)
I was thinking about simultaneous constrained gradient descent, though idk much about it and how it converges. Since my domains arent convex, but still do have some structure there may be better ways to do this, any suggestions?
Dr. J. Stockfish
also, if anyone has suggestions for the same case except $Y = \mathbb{S}^{d}$, feel free to suggest as well, though the former case is more important to me
Dr. J. Stockfish
if F is affine with respect to x, then given a fixed y it seems possible to minimize F(x,y), even over a sphere, with something that is analytic or maybe pseudoanalytic
one can consider casting the problem in a higher-dimensional spherical coordinates
you enforce periodicity conditions but now the problem lives on something that is hopefully convex?
well if you fix y then yeah the problem becomes easier, but I don't quite get how to go from that to my case in a computationally reasonable way. The order of the problem is that you need to maximize f first for a fixed x, and f is not affine in terms of y, (but it is quadratic/ qubic in terms of y)
it might be better if I texit here hold on
$$\min_{(x_1, \ldots, x_m) \in X} \max_{(v,w) \in Y} \sum_{i=1}^m \langle x_i,v \rangle wH_i w^T$$
Where $H_i$ are real $d \times d$ matrices.
Dr. J. Stockfish
you can also shorten the problem by writing
$$\min_{(x_1, \ldots, x_m) \in X} \max_{v \in \mathbb{S}^{r-1}} || \sum_{i=1}^m \langle x_i,v \rangle H_i||_2$$
Dr. J. Stockfish
where the 2 norm here is the spectral norm, the problem with the spectral norm is just that it has differentiability problems and just behaves weirdly in general
would it be a bad idea if I enlarged my domain to be everything except zero and just normalize every term I have in my expression? then we'd have almost a convex domain at least.
My thinking was that you could reduce the problem down to optimization only over y, if you could express optimal x in terms of y. But perhaps that is not possible now that you've given this expression, I'd have to look at it once I'm at home
but the order of the max and the min would require me to first find an optimal y in terms of x. Then I would have to minimize, that makes the problem a bit more difficult since I dont see an obvious analytic solution
Ah you're right this is my mistake. I'll give it a closer look later
if m = 1, then i think the maximiser for the inner optimization problem would be the normalized projection of x_1 onto R^{r-1} right?
Yeah if m=1 the maximization is just picking a unit vector in the span generated by x_1
Hey, does anyone know much about heap data structures? Specifically binomial heaps?
I'm trying to learn about this data structure and I can't figure out if the heap data structure allows arbitrary functions to be applied to the nodes in the heap (or nodes which satisfy a given property of that heap e.g. all nodes in the tree of a given order). As far as I can tell, the only functions that can be applied to the heap are merge, insert, find min/max, delete min/max, decrease key, and delete. Does the heap have to be rebuilt after every function application or can it wait until all applications have been made? Some pointers or corrections would be much appreciated (I'm not a CS person, the most maths I ever did was a maths major in my undergrad)
I'm trying to learn about this data structure and I can't figure out if the heap data structure allows arbitrary functions to be applied to the nodes in the heap (or nodes which satisfy a given property of that heap e.g. all nodes in the tree of a given order).
If I understand you correctly, Generally no though you can code in a solution for this/find a customized solution
As far as I can tell, the only functions that can be applied to the heap are merge, insert, find min/max, delete min/max, decrease key, and delete.
These are the function that are the most important to measure from a time complexity perspective since they define a heap.
Does the heap have to be rebuilt after every function application or can it wait until all applications have been made?
Depends on the implementation but for most the function applications you mentioned, (insert, pop, etc.) you do not need to rebuild the heap as that would be quite inefficient
Hi where can i ask questions about optimization?
This is the place
So i am mega confused about how to do question 7c
First time doing optimization stuff
What have you tried, and what are you stuck on exactly?
Okay so i have tried computing a few iterates using the steepsst descent method
But i get stuck because i dont know whats an appropriate choice of alpha
Or how to find it
One approach is to minimize (over $\alpha$) $g(x_1)$ subject to $x_0 - \alpha\nabla g(x_0)$
My textbook says that the value that minimizes alpha is 1/L
kirby
Where L is the lipschitz constant
That works as well
Alright ill try that
One tiny question though
Theyve given me an initial value for x of x0 = 0
Heres what ive done for it
Yeah, so the initial value of x is just the 0 vector
I think it might have just been easier to have chosen 1/L or something along those lines, but this should still be ok
Iโm silly and leading you in the wrong direction.
Going back a few steps, you have [x_1 = \alpha\frac{2}{n} A^Tb]
Now you need to find the value $\alpha$ that minimizes $\frac{1}{N}|Ax_1-b|$, but I realize now that because you donโt really have any information, Iโm leading you in circles. The way your textbook says to use the lipschitz constant is probably what is expected
kirby
In an earlier part of the question that is exactly what we had to do
Find the lipschitz constant
For the gradient of the system
Yes, so you just need to choose a step size sufficiently small, which just means $\alpha \le \frac{2}{L}$
My apologies for making that more complicated than it shouldโve been
kirby
Hey nw, thanks for the assistance
So to put it together, how do i know how many iterations are required, since ive computed one and gotten what i showed above
I just mean to ask how do i know when to stop
Yeah so what you need to do is stop when you have error less than $\varepsilon$, which is fixed.
kirby
So I believe what itโs asking is to let eps > 0 and find the number of steps in terms of epsilon if that makes sense
No worries
So it should make sense that your functional is minimized when Ax=b, correct?
Yes
The idea is to use steepest descent and find the number of steps such that [\frac{1}{N}|Ax-b|^2\le \varepsilon.]
Now what does this mean? Say I give you a number $\varepsilon > 0$, your job is to tell me how many steps for the functional to get below that value
kirby
You can let the steepest descent run as long as youโll like, and youโll have some error usually, but you get closer and closer to the real minimizer
so since I didn't get much attention in advanced analysis I'll ask here since my question is also quite applied
does anybody know what's the reasoning behind constructing a dual program when you have primal program with cone constraint? I'm seeing that if I have a linear function, linear equality constraint and cone constraint (x must be in convex cone K) then the dual program is very similar to what you'd get by constructing lagrangian except that I don't see how could one derive from such a general program that exactly this kind of expression must lie inside the dual cone K*
$\left{\begin{matrix} c^Tx \rightarrow \underset{x \in K}{\min} \ s. t. \ Ax = b \end{matrix}\right. \ \left{\begin{matrix} b^Ty \rightarrow \underset{y}{\max} \ s. t. \ c-A^Ty \in K^* \end{matrix} \right.$
If we have $s = c - A^Ty \in K^*$ then by definition of dual cone we get that $s^Tx \geq 0$, but if $s^Tx > 0$ doesn't that mean that lagrangian $L(x,y) = (c-A^Ty)^Tx + b^Ty$ is unbounded below in $x$?
so in a sense the dual seems to be well defined on the boundary of the dual cone, but we still allow the interior points as well for some reason?
Transparent_Elemental
Hi i dont wanna hijack the thread
But i have question to ask
Im stuck on this very small part of q5
I found the lipschitz constant
And i have to rearrange the equation in the way theyve shown in 5
But the inequality is the other way round
i think your convexity inequality f(y) <= f(x) + grad(f)(x)(x - y) + ... should be in the opposite direction
Damn i think youre right
Does it make sense if its the other way round?
Does it still
Hey guys quick question
How do i show that a function is strongly convex, i know that one part of it is convex
From previous parts i have that f(x) is convex
Do you have the definition of strongly convex functions?
Cause it should just follow from that
I looked at the definitions i still dont get it
We know that ||x||^2 is strongly convex
$||x||^2 is strongly convex
$||x||^2 $is strongly convex
What's the definition of strongly convex that you're using
Alright i know what to do now actually
But i dont know how to prove that the euclidean norm of x^2 is strongly convex
Very late response, but just take the second derivative, I think it's called the frechet derivative
Any good recommendations for getting into computational complexity theory and computability theory?
Sipser
Hello
does anyone know how to solve this? please help!
Suppose a population satisfies a logistic model with r = 0.4, K = 100, x(0) = 5.
Find the population size for t = 10.
elaine rich's automata book
sipser is the classic though
actually its for a friend of mine who's really stressing out about it
so he probably knows the logistic function
do you know how to solve it?
okay
@long lake the solution hes getting isnt correct
but hes plugging in the variables
i have the equation right here
If I were to plug in the variables I would get around 74.18
Can you tell him to ask the question here
It will be easier
Than having a middle-person
ohh okay
yeah using this one would be easiest, lemme check your calculation
The answer is 75
According to the book
@long lake appreciate you โค๏ธ
no problem!
Hi could anyone help me with a Math Bio Question?
The Pacific halibut fishery is modeled by the logistic equation with carrying capacity 80.5ร106 , measured in kilograms, and intrinsic growth rate 0.71 per year. If the initial biomass is one-fourth the carrying capacity, find the biomass one year later and the time required for the biomass to grow to half the carrying capacity
I used this formula
<@&286206848099549185>
Hi guys
So im trying to prove that a strongly convex function has a unique minimizer
I know how to prove it for strict convexity
But how do i prove it for strong
Or can i just say strong convexity implies strict convexity yada yada
Is your function lower semi continuous?
Iโm assuming that to be the case at least
Can you say strong convexity implies strict convexity?
Yes
Okay nvm
I think if you can show that your function is coercive and strictly convex, you should be good to go from there
Thanks
Hey guys
I was able to do parts a and b
But c really eludes me
The hint is supposed to make this easy but its really not intuitive to me still
@minor osprey any ideas?
Could you post the descent lemma?
Armijos rule is a line search method
To bound the step size
But im still not understanding how to prove this part c
Sorry, was in class
Iโll take a look
I would take the hint ofc and assume the inequality is false, then modify the inequality. I would then look at perhaps bounding your left hand side from below using the descent lemma? A lot of this is unsure, but thatโs a first glance
That looks good so far, now notice that $\alpha_n = \beta^{m_n}a$
kirby
Thanks i was worried
Okay so now im kind of stuck here
How do i arrive at a contradiction?
@minor osprey
Now plug sigma into the original inequality
I'd play with it a little bit too see why it's a contradiction
By original inequality you mean the descent lemma
Yeah, you're assuming that the inequality you're given is false
Is this still the right direction? @minor osprey
Are there any mathematical physics people here? in the euler lagrange equation you have a time derivative of a multivariable function f(t, x(t), y(t)) i.e the lagrangian
that d/dt
so let's say you have $f(t, x(t), y(t)) \ f : \mbb{R}^3 -> \mbb{R}$ how are $\frac{\partial f}{\partial t}$ and $\frac{df}{dt}$ different
Neamesis
I know this should technically go in #multivariable-calculus but I wanted to ask people who were studying both math and physics since apparently the notation isn't used in math and only used by physicists
i'm basically asking how are you changing t, without changing x(t), y(t) in the partial case
which notation is supposedly not used by mathematicians?
in the partial case you treat the x and y as independent of t, so you only differentiate with respect to the argument t of f. So the partial derivative gives you the rate of change if you keep every argument of f constant except t
but how can you keep x(t) and y(t) constant when you're changing t?
notice that f(t, x(t), y(t)) is only really a function of time
doesn't really make sense to partially differentiate it if x and y explicitly vary with time
which is what the notation x(t) and y(t) suggests
now if you write it as $f(t, x, y)$, then partially differentiating makes sense
jacob
yea i understand, but ${{\partial}_1 f}$ does make sense because what you're doing is computing the partial w.r.t the first variable and then evaluating that derivative at the point (t, x(t), y(t)) that's what I meant by $\frac{\partial f}{\partial t}$ being bad notation, because it's ambiguous as to what it means
Neamesis
which is what the partials in the euler lagrange equation mean i guess
i.g. something more explicit would be:
$\frac{\partial\mathcal{L}}{\partial q}\bigg|{(t, x(t), y(t))} = \left(\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial \mathcal{L}}{\partial\dot q}\right)\bigg|{(t, x(t), y(t))}$
jacob
writing that as $\frac{\partial\mathcal{L}}{\partial q} = \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial\mathcal{L}}{\partial\dot q}$ doesn't seem too sinful to me notation-wise
jacob
yes
Numerical analysis :
hello, I need to prove that if $\displaystyle \limsup_{n\to \infty} \frac{|x_{n+1} - x^|}{|x_n - x^|} \in [0,1[$ implies that $x_n$ converge at order 1, i know that i need to get out that lim sup but i don't have any idea on how to do that, someone can help me ?
Shinke
Im not really sure but perhaps you could show that lim sup <= lim inf, this would proof the existence of the limit
sure but i know that a similar proposition ignore the lim sup and suppose that lim exist. if limsup exist and in [0,1[, the gap between elements of x_n and x* is just converging to 0 ?
Can someone explain how the big O is simplified here?
Like I mentioned in pdes, what is the scheme you're using?
this sheme
๐
I thought the O(t) would be O(t^4) because of the division by this term for t
instead of O(t^3)
The 1/24 \Delta t^4 u_{tttt} terms cancel out upon subtraction
Is that the reason we then have O(t^4), and that is then divided by the term 1/t?
giving O(t^3)?
But I thought that if you divide O(t^4)/t=O(t^3)?
Well, O(t^3) isn't wrong, but it's the same reason why there's no \Delta t terms
What I'm getting at is that I would suspect that you would get O(t^4) here essentially because the O(t^3) terms vanish
Thanks, but that leaves me to the next question for O(x^4)
If we follow this logic O(x) should be O(x^3) shouldn't it?
Because you end up leaving the uxxxxx terms leaving O(h^6), this is then divided by h^3 giving O(h^3)
instead of O(h^4)
I'm reffering to this term
It's weird because this one actually was worked out correctly unlike the time derivative terms. If you were to calculate the O(h^6) terms, you would establish that they vanish like the O(t^4) terms did in the time derivative
I didn't understand. Is there a mistake according to you in the h part?
Nono, there is no mistake in the h part. The O(h^6) terms vanish when you apply it to the scheme
My doubt was why there is an O(h^4) instead of an O(h^3). Was my logic from before correct?
Yes, your logic is correct
However, it's the same reason why I mention that you should have an O(t^4) there
Could you explain this? I didn't understand this then
no prob
Okay, I'm going to look at the time derivative terms (they'll work out the same for the spatial derivatives)
[u_j^{n\pm1} = u_j^n \pm \Delta u_t + \frac{1}{2}\Delta t^2u_{tt} \pm \frac{1}{6}\Delta t^3 u_{ttt} + \frac{1}{24}\Delta t^4 u_{tttt} + \mathcal{O}(\Delta t^5)]
Now, we want to fill these into the numerical scheme
[\frac{u_j^{n+1} - u_j^{n-1}}{2\Delta t}]
We then establish
\begin{align*}
\frac{u_j^n + \Delta u_t + \frac{1}{2}\Delta t^2u_{tt}+ \frac{1}{6}\Delta t^3 u_{ttt} + \frac{1}{24}\Delta t^4 u_{tttt} + \mathcal{O}(\Delta t^5) - \left( u_j^n - \Delta u_t + \frac{1}{2}\Delta t^2u_{tt}-\frac{1}{6}\Delta t^3 u_{ttt} + \frac{1}{24}\Delta t^4 u_{tttt} +\mathcal{O}(\Delta t^5)\right)}{2\Delta t} \
\frac{u_j^n - u_j^n + \Delta u_t +\Delta u_t+ \frac{1}{2}\Delta t^2u_{tt} - \frac{1}{2}\Delta t^2u_{tt}+ \frac{1}{6}\Delta t^3 u_{ttt} + \frac{1}{6}\Delta t^3 u_{ttt} + \frac{1}{24}\Delta t^4 u_{tttt} - \frac{1}{24}\Delta t^4 u_{tttt} + \mathcal{O}(\Delta t^5)}{2\Delta t} \
\frac{2\Delta u_t + \frac{1}{3}\Delta t^3 u_{ttt} + \mathcal{O}(\Delta t^5)}{2\Delta t} \
2u_t + \frac{1}{6}\Delta t^2 u_{ttt} + \mathcal{O}(\Delta t^4)
\end{align*}
kirby
So that's the process for deriving the terms in the picture you sent
You can do this for the spatial derivatives to determine why O(h^4) is correct (this requires expansion of the 6th order terms or faith in noticing that the terms are symmetric for even orders)
Thanks a lot for your answer. But I now have a question about the logic, because you end up with O(t^4) instead of O(t^3)