#numerical-analysis
1 messages · Page 18 of 1
in my context, A is nonsingular but poorly-conditioned
A is nonsingular but poorly-conditioned so A*A is positive definite but even worse conditioned
We slightly perturb this with alpha*I to make it more diagonally dominant
Making a matrix more diagonally dominant makes it better conditioned, see the Gershgorin circle theorem
i understood that much but it feels weird to me how the addition of αI does not fuck up the solution
like, wouldn't there be some error introduced by this +αI
alpha is small
so i can expect the error to be on the same order of magnitude as alpha?
You pick it to minimize the tradeoff between round-off errors you get from the poorly conditioned A*A and the perturbation error
As alpha gets larger, A*A+aI becomes better conditioned and round-off errors decrease
As alpha gets larger, the perturbation error increases
im looking for something quantitative here
If $\hat{x}$ is the solution to $Ax=b$ it is also the solution to $A^*Ax=A^*b$. Then if $x_0$ is the solution to $(A^*A+\alpha I)x=A^*b$ then you can calculate the error $\norm{\hat{x}-x_0}_2=\norm{(A^*A)^{-1}A^*b-(A^*A+\alpha I)^{-1}A^*b}_2$
Angetenar
This is the perturbation error
This can be simplified
If the condition number of $A$ is $\kappa$, then the condition number of $A^*A$ is $\kappa^2$. $\log{\kappa}$ is approximately how many digits of accuracy you lose compared to machine precision, so you lose twice as many digits of accuracy
Angetenar
This is not exact, but a bound
Making a matrix more diagonally dominant makes it better conditioned
I was thinking about this. @Angetenar#4727 do you suggest any references about any of this?
@wide spear
a bound is okay
||I found this in Vargas' Matrix Iterative Analysis ||
I may also recommend Demmel’s book on numerical linear algebra
Linearity of expectation?
The very bottom line right?
What are you trying to figure out about it?
hitbox
Why $\frac{\gamma}{1-\gamma}$ and not just $\frac{1}{1-\gamma}$?
Oh nevermind I see ok
Well, I can find a 1/(1-gamma)
Good luck with the rest
You have $\sum_{t=0}^{\infty}\gamma^t=\frac1{1-\gamma}$ which you can then bring out of the expectation
Angetenar
Oh rip
Conway polynomial I think? Based on digging through documentation
@dapper cargo can you tell me if this is the official sagemath documentation? https://doc.sagemath.org/html/en/reference/finite_rings/sage/rings/finite_rings/finite_field_constructor.html
I just don't see where it gives any explanation of the arguments it just gives a bunch of examples lmao wtf
I assume the first argument is the order and the second is the name I guess
Oh yeah this is what I was looking at
yeah looks like it probably gives you a root of this polynomial, well in this case the generator of this field is obvious
yeah gives you a generator apparently of that polynomial just playing around to see for myself
yeah of the multiplicative group
Hio frands. I hev question on data fit to equations, signal decomposition and why wolfram gives me poop outputs.
So I have some signal data. To the best of my knowledge the data seems to fit some sort of weird distorted sigmoid curve.
so currently Im on Desmos basically banging on equations to try and get a reasonable fit to the signal i've got.
but this is suuuuuuuper annoying and time consuming.
Im wondering if this has already been automated...
Basically I want to supply:
- A generalized form of an equation
- A coordinate domain within that equation
- A number of data points
Then have the machines churn away and give me the best fit for the equation from #1 and give me that modified version back.
How generalized is the form of the equation? Is it like f(x)=c_1x+c_2x^2+c_3/x or is it like f is a polynomial
Also, I think you would want to specify how you are measuring the best fit
Are you computing the L1 error norm? L2? L infinity?
re: best fit... no idea how to respond to this bc im too newb
Are you familiar with linear regression?
I've kind of come up with semi-fitting equations so like its kind of like ... I've simplified the search space manually but then I want the tool to do all of the fine tuning for best fit
yes semi-familiar w lin reg
Ok so in linear regression you minimize the sum of the distances squared right?
yeh
This is the 2-norm error
got it.
You might also minimize with respect to the direct sum of the distances, without squares
This is the 1-norm
so L2 error is '2-norm' error?
Yes
You might also minimize with respect to the greatest distance between a regressed point and the data value
so where does R^2 fit in this ?
Ok so L2, is 2-norm is R^2
But being in R^2 doesn't really matter
Oh duh.. because distance squared
Essentially what you will want to do is set this up as an optimization problem
yup.
Minimize error subject to some constraints
The particular form will depend on your problem and the specifics
so first question: This seems like something that wolfram alpha would do.
But the error should be differentiable so you can use gradient based optimization methods
You'll want to code this up yourself
meh.
So that you can have more control
thats the opposite of what I want to do lol
i want less work
LIke i've already fitted a kind of bi-equation fit which is close
but like I know computation could get closer.
Coding it will get you the best result
The trade off is between how much work you want to do and how good you want the fit to be
ya.
and ive already like put waaaaaaay too much work into this over the past few days
I think ima ask the client "do you want more accurate"
then we need to do math modelling for fit
PAY ME MORE BREH
^ actual convo
@wide spear is there a name for this kind of stuff?
like ... computationally searching variants of an equation for best fit?
"regression fitting" lol
Just curve fitting
There are tons of curve fitting libraries.
I personally have experience with Python's scipy.optimize.least_squares method.
This should solve your problem
Curve fit also works, although least_squares is slightly more general.
I think curve fit builds on least squares
Yes
Generically, this problem is "convex optimization"
Somewhat more specifically, "curve fitting"
I think from a "I want to code as little as possible" perspective, curve fit might be better
Probably.
The basic process here is a) Write a method that takes some number of arguments and outputs a result. The argument here are the parameters that go into your equation model. b) Pass the dataset and this equation method to curve_fit, and let it go to work. It will, hopefully, return the best-fitting set of parameters it could find.
So, for a parabolic fit, for example, your method for part a might look like:
def parabolic_fit(x, a, b, c):
return a*x**2 + b*x + c
x is the placeholder for your input data. a, b, and c are the fitting parameters.
@autumn crow would you approximate the scipy method to be more accurate than wolfram alphas website?
I mean W|A will not do this
I haven't used W|A for curve fitting.
just too computationally expensive?
I mean W|A has limited functionality to get people to buy Mathematica
I'm sure you can do this in Mathematica as well
I don't know how you'd make it do the sort of generalized curve-fitting you're looking for. It can probably do polynomial, exponential and log fits, but not much else.
idk, if you ever got an answer to your earlier question, btw, about why it's always e^ and ln in fits.
snatch seems to have an equation form they are trying to fit to
Ohhh actually easy question: I have this format of equation:
-y=1/x
what is that called?
It's because the base doesn't matter. You can always change base to whatever you want and it will just change the coefficients.
Negative reciprocal?
This is a rational equation
like its not a log, its not exponential, its got two asymptotes
I mean like the phrase describing that shape
sec ...
Not every function type has a name.
Yes, I know what it looks like.
Rectangular hyperbola?
its possible im mis-remembering
Canned curve-fitting routines like those found in Excel will not fit those kinds of functions.
or just like assuming it has a name bc I saw it around the same time when I was learning about exponentials / logs / hyperbolas/ quadratics
It might be a hyperbola.
Yeah @autumn crow i learned this the hard way
Honestly, I don't remember what defines a hyperbola.
Err. No.
wait maybe youre right
The reciprocal definitely -looks- similar to a hyperbola.

anyways... back to the hard stuff
But it might not be, technically. Like I said, I don't remember the formal definition
Ok so again EASIEST POSSIBLE SOLUTION...
Anyways this should be in #precalculus or something
with the scipy approach
I want to define some coordinate subset area of an equation.
actually eff.
maybe I should just do this before makign it more complex
~WEHHHH
curve_fit has a bounds keyword, that you can use to limit the search space.
Oh wow.
its like people have walked this path before me.
TY @autumn crow
and @wide spear

This is a very common thing to do.
It's a large amount of applied math
Convex optimization is a field all its own.
Convex and convex-adjacent optimization is giant
Oh cool.
"Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets."
Like i understand that sequence of words
What is your question
perhaps someone can give me some input here
i'm simulating a scalar field using neumann series
i have something of the form $x \in \mathbb{C}^n, \gamma \in \mathbb{C}^n, G \in \mathbb{C}^{n \times n}, A = G \cdot \text{diag}(\gamma) \in \mathbb{C}^{n \times n}$
Edd
and there's the truncated neumann series i'm evaluating
which is of the form $y = \sum_{k=0}^K A^k x$
Edd
the thing is, n here is huge. that matrix A is about 10GB. i'm aware the gamma can be multiplied as a hadamard product with x to save some memory and time. the matrix G is also symmetric two level block toeplitz, so i have split it up into only the unique n entries
still, evaluating this is taking upwards of 20 minutes. what might be a good way to go about it?
btw i'm aware i could use some sort of iterative regularized least squares approach, but if it takes more than 5 gradient descent steps, it'll already be slower since i'm taking K = 5
not that i am good at computational linear algebra 🙂 but maybe i can poke at the problem with you, what does your code look like/how much RAM do you have/are you interested in multithreaded solutions?
i have 16 gigs of ram, but i only need about C*sqrt(size of matrix) of ram to iteratively yield the results (for some small constant C. for this example, i end up needing about 1MB of ram for an equivalent 10GB matrix)
since i don't especially need the matrices, only their products with vectors (and later on a good guess of their eigenvalues)
at the moment, if we let A be of size M^2 x M^2, i store (not in memory) the unique M x M samples that define it (two level symmetric block toeplitz) and then use indexing operations to produce one 1 x M^2 row at a time and build products with vectors from that
i additionally check for the nonzero elements of the vector and use that to select only the corresponding columns of the 1 x M^2 row
regarding multithreading, for the time being i'm relying on numpy knowing what it's doing, but later on i'll move this stuff to a cluster
whats the code look like?
i can share that much
one iteration of the algorithm involves multiplying 1000 of those, each one of size 100^2 by 100^2 (only needs memory of size 100 x 100)
so of course an easy way out is to parallelize over the 1000 matrices, but i can't do that for now since i don't yet have access to a cluster
are you then computing sum(k=0 to K) A^k x as x + A(x + A(x + ...))?
hm, it's getting late for me, but i'll play around with this a bit tomorrow
sure thing, thanks for the interest
i guess overall it comes down to the implementation of generator but if this thing is just copying some memory around then i'm not sure what kind of optimizations there are to make. i wonder if the best gains to be made would be from reducing memory copies (switching to c or smth) and using threads if numpy is not already doing this
it might be helpful to look at the code for numpy itself to see exactly what its doing
i think numpy uses fortran libraries but i could be wrong
yeah, afaik numpy falls back on c and fortran
for completeness, generator computes hankel functions. that is admittedly kinda slow, but there are optimized implementations of bessel functions, so i used those. furthermore, those generator matrices do stay fixed throughout the whole algorithm, so as you send, it's just reading from storage into memory
Numpy uses BLAS/LAPACK so that won’t be an issue
What do you know about the matrix A?
If A is toeplitz you can compute A*x using a FFT and iFFT
This might be faster for a single processor but FFTs have well known issues with parallel scaling
Alternatively, because you can exactly determine what A is from these n entries and a bit of math, you can implement a custom matrix multiplication
Much of the slowness of mat mul comes from communication costs of matrix entries to/from cache and n entries is small enough to store in cache
So you can cook up something in C or Fortran or whatever where you manually manage what happens in cache and ram and what not
the individual blocks of A are toeplitz, so that could speed up the computation with FFTs, you're right
You can pad A to be completely toeplitz
Then when you do the computation you can keep track of which entries to ignore
hmmm i say that, but even though they are symmetric toeplitz, it doesn't have the structure of a usual convolution, nor is it circulant
What do you mean?
the columns can be thought of as windowed chunks of signal, i would have to pad the hell out of the vectors
otherwise there'd be crazy aliasing
Is it sparse?
no
Rip

Ummm, if the matrix is block toeplitz, you can recursively break down a matrix multiply until you get to the point where everything is toeplitz
hmmm
Reading from storage into memory is the slowest part of the whole process
You need to think about communication optimization and communication avoiding algorithms
If A only has n entries, you should be able to store it in cache
A is an m x m matrix of m x m matrices, i never have the full A at a time
Oh ok
i take m samples at a time and unroll them
Then you can work with each sub-matrix at a time
yep, that's how it is atm
Ok and you should be able to store the m entries in cache right?
yeah
x is length m^2?
It may not fit in cache
Think about how to optimally access entries of x if that's an issue
Well if everything fits in cache, there isn't much speed up to be done
yeah about 300MB is fine
By cache I mean like the processor's L1/2/3 cache and not in ram
i'll look into it, thanks for the input
Off to math help you go
How much do you know about your sparsity structure?
How simple do you want it to be?
Obviously the simplest thing to do would be to ignore that the matrix is sparse and just do it naively
But this would be very bad right
Ok
What is your starting point?
Do you have dense LU code written?
Have a look at these slides
I'm not familiar with DAE solvers, I'm assuming the resultant matrix is sparse but is it also banded? It could be potentially easier to write a sparse banded solver in this case.
I think they’re assuming no structure beyond sparsity
makes sense, disregard my comment above then
is there a way to rewrite this computation so it uses fewer instructions/runs faster?
it's not c but it the language uses c syntax and i assume compiles like normal c
maybe pushing in the n_2 into the square root to avoid a division? but i'm not sure how much faster this will be, i would guess that sqrt and all the other operations are already enough cycles that this will not matter too much. i'm assuming that sqrt/pow/dot get inlined, and that dot is using simd
cool. I just wanted to make sure there wasn't some big improvement i was missing, since this has to run quite a few times
out of curiousity, what language is it?
so it has to compute for every ray that hits an object from a raytracing algorithm(blender cycles)
Change the multiplication by 2 to a bitwise shift
i could also replace 1 - dot(i, n)^2 with | |cross(i, n) ||
but idk if that would be faster
Of course a dot of this will depend on specific code implementations
But also have faith that arithmetic is optimized
For example, on Intel systems, most arithmetic should call the Intel MKL library
They optimize so you don't have to
i can also implement the dot/cross products outside the code itself
something is wrong with this code for gauss-seidel
here's the algorithm
I simplified my code below, still doesn't converge
tried different initial values, including the solution
Pick a linear system and hand calculate the first iteration
fwiw, i ported this to python, and it seems to work/converge on random diagonally dominant matrices
~ $ cat discord.py
import numpy as np
def gauss_siedel(A, b, N):
n = len(A)
x = np.zeros(n)
result = np.empty((n,N))
for k in range(N):
for i in range(n):
tot = 0
for j in range(i):
tot = tot + A[i,j]*x[j]
for j in range(i+1, n):
tot = tot + A[i,j]*x[j]
x[i] = (1/A[i,i])*(b[i]-tot)
result[:,k] = x
return result
n = 10
N = 100
# make it strictly diagonally dominant
A = np.random.random((n,n)) + 10*np.eye(n)
b = 5*np.random.random(n)
result = gauss_siedel(A, b, N)
print(result.T)
print(np.allclose(np.matmul(A,result[:,-1]), b))
~ $
It should work for SDD still right
I'm not expecting it to work for all matrix, like do you mean saying this that there's a mistake in the code?
Yes it should work for strictly diagonally dominant matrix
The best way to check if it converges is to see if it's giving you the numbers it should
If it doesn't, hand calculating can tell you which numbers are wrong
But 8da seemed to have fine results with the code?
thx for checking
Our hw asks us to check if it converges for this
like it's supposed to
i mean idk but I think it's supposed to
i am also self studying numerical linear algebra right now, so this is a good exercise for me 🙂
otherwise it wouldn't ask for the 2-norm error
This matrix is very much not strictly diagonally dominant
Fortunately, this does have solutions
also, i think the largest eigenvalue here of (D+L)^(-1)*U is about 478, although i am not familiar enough with iterative methods to know what is the nature of regions of the initial value that can converge when spectral radius is greater than 1
Anyways Gauss-Seidel sucks
If the matrix is symmetric positive definite you want to use conjugate gradient
And successive over-relaxation is just gauss-seidel but better
yea imma try successive over relaxation also
If your SOR code converges then just set w=1 in that and you get GS
Anyways for a strictly diagonally dominant matrix I don't know why you wouldn't just LU it
wait 8da mind if you try gauss seidel on this to see whether it converges?
A = [0.4218 0.6557 0.6787 0.6555;
0.9157 0.0357 0.7577 0.1712;
0.7922 0.8491 0.7431 0.7060;
0.9595 0.9340 0.3922 0.0318];
B = [0.2769;
0.0462;
0.0971;
0.8235];
cuz someone said it does, but i found it doesnt
yeah, it didn't converge for me
i see
well, maybe it depends on the initial value
I tried initial value = to solution
and it still diverges 
hopefully it's not an implementation mistake, thanks for checking!
fwiw i just tried setting it to the solution as well, and it didn't converge for me again

in implementation jacobi is so similar to GS right
the only difference is in jacobi you make a copy y, and then after the iteration assign x=y
while GS you just modify x live
i guess so, my background in these methods is having speed-read some lecture notes, and "convincing" myself i understood it, but now going through an exercise, i find i am confused
maybe tiny numerical differences from setting the initial value to be the "solution" causes blow up
ah
SOR doesn't converge too I tried like 20 w's, maybe my prof just didn't try this himself lol
now I'm wondering if there's a way to minorly adjust the matrix so that it's SDD or maybe so that they work..
i guess what should happen is, when we set the initial value to the "solution", if the error between the "solution" and the true solution does not have a big component that is along an eigenvector of (D+L)^(-1)*U whose corresponding eigenvalue is big, then it should converge? ignoring floating point issues
i have no idea, will have to look back on slides... maybe angetenar knows
question repub from #math-discussion
i have a matlab related question: for some basic signal processing (LTI signals, Fourier Series/Transform, Filters eg), would I need to download a specific toolbox?
You can take a look at the Signal Processing Toolbox. I'm not familiar with it myself, though.
ty
@fleet sail @brave crypt what was your question?
Also @gentle grail a lot of the basic things you'll use for signal processing won't require a specific toolbox
Like all of the Fourier transforms
so i realized SOR in general converges way slower
for larger matrices like for 4x4 symmetric positive definite usually 10-20 iterations stabilize to 4d.p but for like 100x100 it takes thousands
why is that?
Well, it depends on how you pick omega, the relaxation parameter
i mean for GS as well
in theory should converge right, so its just the rate
but idk why for a random 10x10 it's way slower than 4x4
like n. iterations
Probably something about the condition number
wait wdym
Do you know what a condition number is?
no
It's the ratio of the largest and smallest singular values
The larger it is, the slower iterative methods converge
And I think a random 10x10 matrix will on average have a larger condition number than a random 4x4 matrix
I see
The condition number is closely tied to the spectral radius
damn lol why am I using this method if i can just gauss elim it then
For example, when you are solving the discretized version of Poisson's equation, you end up with a tri-diagonal matrix that is 4 on the diagonals and -1 on the off-diagonals
This is a super well-conditioned matrix, every iterative method converges
However, there is an iterative method called multigrid which solves this in O(nlog(n)) time
That's absolutely bonkers
Most things in NLA have a min of O(n^2) because that's the time you need to read a matrix
Multigrid goes faster than that
However, it has very specific use cases
Namely, it can only be used on elliptic pdes and matrices coming from those
wait how about storing in CSR format
is that what you mean by multigrid or
cuz for sparse matrix wouldnt that be how you store and so you take O(n) or less to read matrix
I mean sure using a sparse format to store the matrix is fine but that doesn't really fix the O(n^3) for Gaussian elimination
Oh I see so your talking about iterative methods to solve
Even something like conjugate gradient
It gives the exact solution in O(n^3) time but it converges extremely fast
It's also quite difficult to understand
spectral radius is equivalent to lipschitz constant for linear operators right
Sounds right
Oh, but lipschitz constant under what norm?
idk
intuitively it seems like if you fix a norm then the relation is linear though? not sure
For matrices there is a concept of equivalent norms
Essentially they are all the same up to constant factors
ic
If you want a reference, you can dig around in Demmel's Numerical Linear Algebra book\
One of the exercises is to prove the equivalence of a bunch of matrix norms
all norms r equivalent, so they must all converge if one converge, but how does that dictate that speed is same?
You can get a bunch of $C_1\norm{A}_q\leq\norm{A}_p\leq C_2\norm{A}_q$ for arbitrary norms
Angetenar
So yeah the speed cannot be too different
Actually I lied about the Lipschitz thing
Consider the matrix $C_x=\begin{bmatrix}0&x^{-1}\x&0\end{bmatrix}$ which has eigenvalues $\lambda=\pm1$ so its spectral radius is $1$. However, it's Lipschitz constant is $x$ or something, so they are definitely not the same
Angetenar
but i think there is a result that says that the spectral radius is the infimum of lipschitz constants (norms), or something like that
bruh ok my GS actually converges but its insanely crappy lol
My code, I'm still waiting for result
FAILED doesn't necessarily mean not converging, in seidel its usually cuz it convergence is way too crappy
Yeah it sucks that all the old people in academia use Matlab
Python is just so much objectively better
Oh my
First few trials, the matrix is pos def
Wait what if there's 0 entries in diagonal
don't you need to check and permute first before you split
or is the situation SOR or such is used always assuming the matrix is structure in such a way
What do you mean 0 entries in diagonal
cuz if A has 0 in diagonal,
All 0s?
any 0, then tril(A) is not invertible
ic, so it should be unlikely, and a valid assumption
that there's no 0's in diagonal
cuz ill never use these methods for sparse matrices or with special structure s.t. 0 is somehow likely to appear
I mean GS is only guaranteed to converge if the matrix is SDD or SPD
ah btw, so my question the other day was, so i think with GS, you can get it to converge even if spectral radius of the matrix let's call it B is bigger than 1, as long as say your initial value x0 is not very parallel with the big eigenvalue'd eigenvectors of B, is that right?
er, not the initial value x0, but the difference between it and the true value. at least assuming the eigenvectors of B span
GS can converge even if the spectral radius/condition number is larger than 1
Of course, it will not converge if the condition number is infinity because then the matrix is singular and you don't necessarily have solutions
And on computers it won't converge for condition numbers larger than 10^16
But spectral radius/condition number determine how fast it converges, not whether convergence can happen
oh, i thought it was both?
p(A) = 1 is the only undetermined i think
er, at least for spectral radius. spectral radius < 1 means it should always converge, ignoring FP nonsense, was my understanding
So when the spectral radius is less than 1 then yeah it makes sense that things would converge
Like you know the method for finding eigenvectors right?
Similar intuition I think
oh, what is the method for finding eigenvectors?
So there is an iterative method for finding the largest eigenvector of a matrix
The way you do this is you pick a random vector, and repeatedly compute x_n = A*x_n-1
This converges to the largest eigenvector
You'll also want to normalize these
This is called power iteration
ah, i think it sounds vaguely familiar
This can be generalized to find any eigenvector if you know the eigenvalue
You replace A with (A-lambda*I)^(-1)
So the eigenvalue corresponding to lambda becomes the largest
And then you get the corresponding eigenvector
Of course, this brings us to the problem of finding eigenvalues
Because calculating the characteristic polynomial of a matrix is very difficult and solving polynomials is also very hard
Wait for power it
you don't need to know eigenvalue though
like don't you just assume there's a eigenvalue that's strictly largest
and once you found eigenvector you can solve the largest eigenvalue?
ah ok
squirtlespoof
squirtlespoof
Compile Error! Click the
reaction for more information.
(You may edit your message to recompile.)
I also think that these letters are generally interesting
To gain insight into the practice of mathematics and how it is done
You don't really gain this insight when reading papers
Papers are clean and everything works out
But actually doing math is not
@lilac forge did you get a resolution to your question?
I may have time to take a look later today

With binary functions on n-bit strings, does balanced just mean linear?
I.e., the function $f:{0,1}^2 \to {0,1}$ given by $f(00) = f(11) = 0, f(01)=f(10)=1$ can be written $f(x_1x_0) = x_1 \oplus x_0$
Apopheniac
Given a function $f:\mathbb{Z}_2^n \to \mathbb{Z}_2$, is it true that $# f^{-1}(0) = # f^{-1}(1)$ if and only if $f$ can be written as a degree 1 polynomial in the variables $x_i, 1\leq i \leq n$?
Apopheniac
Balanced means 0 on half of the domain and 1 on half of the domain
Yeppers
I'm pretty sure you can make non-linear balanced functions
Okay, I see now how to make one:
$f(x_3x_2x_1x_0) := x_3x_2 \oplus x_1x_0 \oplus x_3x_2x_1x_0$.
Now to make sure it's not secretly linear...
Apopheniac
Thanks, this in particular clears it up as well:
Apparently a lot of nonlinear balanced! I just hadn't gotten up to a nontrivial amount of variables when I first asked

I see how this goes with that Prop 1 above. If you think in terms of a truth table along with a column for f(x), adding by the new variable x_n has the effect of doubling the truth table's size, and copying over the column of f(x), then below it you'd have f(x)+1. So if you want balance, take what you got and double it.
Hi
im having hard time understanding/implementing the newton raphson method in a project i have i was able to find my x fonction but im not sure if its the right x as you can see in the problem x is defined as x=h/b but also H expressed in a ratio of the dimension 2b which confuses me
i was able to find my fonction
Do or don't
understand*
Ok
don't *
Ok
The goal is to find a solution x for f(x)=0
To do this, we start with an initial guess x_0
We iterate on this
With the iteration defined by $x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}$
Angetenar
the itteration is adding one each time right?
What do you mean by adding?
#include <math.h>
#include<stdio.h>
const double Epsilon = 0.001;
//fonction is (x^3 * d2) − (3*x^2*d2) + 4*d1
double fonction(double x, double d1, double d2)
{
return x * x * x * d2 - 3 * x * x * d2 + 4 * d1;
}
double fonctionDeriv(double x, double d1, double d2)
{
return 3 * x * x * d2 - 6 * x * d2;
}
double zeroOfFonction(double x, double d1, double d2)
{
int i=0;
double rapportF = fonction(x, d1, d2) / fonctionDeriv(x, d1, d2);
while (fabs(rapportF) >= Epsilon && i<100)
{
rapportF = fonction(x, d1, d2) / fonctionDeriv(x, d1, d2);
// x(i+1) = x(i) - f(x) / f'(x)
x = x - rapportF;
i++;
}
return x/2
ok
Well then h=1/2 * x * (2b)
you lost me there haha
why 1/2?
it is
1/2(x)(2b)=x*b
got it
the x_0 how to find it then? so far ive been doing trial and error which is never a good sign
I see
Picking x_0 is actually a very hard question
In general
In this case you can just pick like x_0=0 or something and hope it converges
it rings a bell im going to look it up
Ok so if you pick some values you can find an interval where the zeroes are
Then you can just pick a point in that interval
okay ill try this
Just a random thought, could you improve monte carlo integration by using some distribution besides the uniform?
Like would there be a case where it is quicker to use a normal distribution than a uniform
doesn't make sense to me
imagine using it to integrate a simple shape, it would give you a weird weighted integral
I think I have certainly heard of integrating f(x) by writing it as f(x)/p(x) * p(x) and sampling from p
I forget the benefits, though
I think it had something to do with low probability regions (well, this doesn't make sense in this context, with a uniform distribution, i think this explanation was for a slightly different scenario)
You probably won't be able to integrate it, but you can definitely pick up other features of the function
This is closer to signals but there are all sorts of filters that people use to pick up and ignore certain properties of functions
Check out factor perhaps?
Does specifying the ring you're factoring over not help?
What is the fundamental issue?
Specify your polynomial ring as R[x,y]
Wait aren't you just trying to simplify this expression?
Does expand + simplify not do this?

maybe not what you want, but you could make a function that differentiates wrt W_s and W_t n and m times respectively and divides by n!m! and then plugs in W_s=W_t=0 to get the coefficients

then just call it like "get_mn_coeff(f)"
haha
yeah idk anything about sage
or maybe try like another package, maybe sympy or numpy has that capability
sympy probably does
What do you mean by temporarily print
Like in regular python
If you have a loop
Inside the loop, you can check if certain variables equal certain values
Then print that expression in that loop iteration
?
Ok you can define a function of x and evaluate it at x=0
Lmfao
Yeah just np.arange and exponentiate
yeah so basically |x|_inf is illegal 
So usually you have B(x)=0 if in constraint region and B(x)=1 or something if outside
Yeah
and then it runs over (|x|_inf, inf)
Yeah
when I saw the graph and what was mentioned for intended use, made me think particle in a box lol
Yes that is accurate
Another common type of barrier is the log barrier
So in one dimension you have B(x,b)=-log(x-b)
Which also goes to infinity
right, I saw this get compared to uh
It's also smooth away from b
There are lots of types of barriers
Yeah
You will want barrier functions to go to infinity so that they actually impose a barrier
No I haven't
I guess this prof be teaching all of us 
I'm not at FAU, just some random slide I found on google images of log-barrier compared to other norms
but yes I should be sleeping regardless, I've been up like ~50hrs+ ?
*56 I think


me this entire month

idk if this is even the right channel
my professor posted this, trying to interpret it
ah floating point, good stuff
Floating point numbers are fundamentally discrete
is this implying that each fan has an equal amount of numbers in it, and they all get stored as the same thing?
what are the lines/hard dashes at the bottom
Each floating point number as an uncountable set of real numbers that get approximated to it
this said The lines in each fan represent a storable number, and there are the same number of lines in each fan (power of 2)
i guess im confused about why each line, which is a floating point number (?) then goes into a single line with a bunch of other numbers
other FP numbers*
A computer cannot store 1/3rd as a floating point number because it has finite precision
Everything is approximated by a floating point with finite precision
but each line is already a floating point number
one way you can think about it is that the "gap" between representable floating point numbers increases as the magnitude increases
There are numbers that are approximated equally well between two floating point numbers
Which one gets chosen is a matter of convention
idk i might not be asking correctly
I'm not entirely sure what your question is
the dotted lines represent floating point numbers
so what are these
why are a bunch of FP numbers going into them
sorry if you answered it and i dont understand
Those are finite precision floating point numbers
I think this is saying all the real numbers above
Get approximated by those floating point numbers below
So if a calculation would’ve resulted in a real number, it just pretends it’s the floating point associated to it below
finite precision floating point numbers 
But I’m just a Chmonkey
I get that part chmonkey
😄 fortunately
so the red circled things are like
decimal numbers
fixed decimals
Base 2 finite decimals
how is that different from a floating point
Those are floating points
or i guess specifically
Do you know what a floating point number is?
why do floating points get scrunched together into one of those
The stuff above flowing into them are real numbers
Each one is a single floating point number
The things getting shoved into each one are real numbers
oh
That's why it says real numbers on top
there are the same number of real numbers that can be stored in each floating point number

that doesnt make sense
Yes, uncountably many
I mean it’s all infinite
yea
the teacher implies its finite
"The lines in each fan represent a storable number, and there are the same number of lines in each fan (power of 2)"
I mean a floating point number does not "store" a real number
It approximates a real number
same number of lines in each fan
Lol
so each floating point number represents a finite set of real numbers 🤷
That isn’t right
The distances between floating point numbers are always powers of two
That would require infinitely many floating points aka infinite memory aka sadge
idk how do you interpret the lines
and specifically each fan has the same number of lines
In computing, floating-point arithmetic (FP) is arithmetic using formulaic representation of real numbers as an approximation to support a trade-off between range and precision. For this reason, floating-point computation is often used in systems with very small and very large real numbers that require fast processing times. In general, a floati...
maybe each one of those is like
Just read this
a change in a decimal point
Bad picture
yea i think i need to research more then
Stop looking at it
Hey guys, I am trying to find a sequence.
What describes this?
How many bit strings of length k have the form
(0...01....1), where all the 1s are on the right side?
Basically 0* 1* where * is the kleene star
I've thought it through for a second, but like
I haven't made much progress
The sequence so far I have found by hand is
2,3,4,5...
k=1 corresponds to 2
I guess I'm not sure what your question is, the sequence is apparently k+1
There can either be no 1, or the first 1 can appear at any of the k places
dang you right
hello, im doing a research on variable relationship to find final value bound, but stuck on how to define relationship in general that works most of the time
currently, the operator allowed is +-*/ and modulus (mod)
the program would receive a set of variable and their range, such as x = { min: 10, max: 99 } and y = { min: 1, max: 9 }, and condition, such as x % y = 0, and x > y
the output should define each variable new range, the variable priority, and equation/formula to get the valid value, such as x = { formula: y * n, subnumber: { n: { min: ceil(10/y) , max: floor(99/y) } }, priority: 2 } and y = { min: 1, max: 9, priority: 1 }
another function would take this output and generate random value within given conditions, so the outout should have specific formatting which would be easier to process
does anyone know how to approach this problem? or have a better suggestion on how to form variable relationships such that it can be used to generate random value within conditions?
Ummmm
Ok let's say you have x in range(1,100) and y in range (1,10) and the condition that x % y = 0 and x > y
What new range and variable and equation are you defining?
Give me an example
The new range or equation/formula should defined such that it could produce all possible value
In your example, the result would be:
y = range(1, 10), should be processed/generated first
x = y * n, with n = range(a, b)
where a = ceil((x_min_new) / y) = ceil((y + 1) / y) = 2, as x must be more than y, so y + 1 would be the new minimum x value
b = floor (100 / y)
This output will be used by a generator, which would:
- process y first as it has higher priority, and let say generated y = 3
- process x, it process n range first, which would be n = range(2, floor(100/3)) = range(2, 33), where generated n value = 12
- x = y * n = 3 * 12 = 36
The generated value should satisfied all range and conditions given at the start
Oh I see
You are given a number of variables with feasible ranges and relationships that they need to satisfy and you want to find values that work right?
Solving any system of equations in full generality is impossible I think
Fortunately for you, because you have feasible ranges, you can always brute force this
So that's a worse case scenario
Depending on your specific conditions, you wind up with any number of different theories of solving equations
For example, this completely subsumes the fields of linear algebra/linear programming/integer programming
Finding all possible values will be a matter of brute force search I think it is safe to say
Yeah I know, but if it could at least define which number should be generated first, and at least some of them could be converted to formula, it would reduce the brute force amount
I mean you don't even know if you have solutions right
And I'm trying to do that, especially like x % y = 0 part, random x will yield impossible value sometimes
I mean yeah you can use these to refine your brute force search
But this will be a lot of coding cases by hand
Yeah I understand that
Fundamentally this boils down to the fact that there is no general theory for solving equations
There is especially not a general theory for solving arbitrary equations over integers
Integer programming is NP-complete. In particular, the special case of 0-1 integer linear programming, in which unknowns are binary, and only the restrictions must be satisfied, is one of Karp's 21 NP-complete problems.
Welp
If it's not constrained by integer, would the solution be feasible?
Well if isn't constrained to integers then % constraints don't really make sense but ignoring that, you might have a better chance
You can recast solving this as an optimization problem and then optimize
Of course for the integer case you can always brute force but depending on the number of variables this may not be practical
there is another weird part, should the values generated be uniformly distributed among possibilities in the originally given conditoins? because then this is another thing that must be kept in mind in coming up with a value generation process
No, as long as there's method to generate value out, it is enough
If you're finding all the possibilities then the distribution doesn't matter right
ah, i thought in the example he gave it was something like: produce y from a uniform distribution, produce n from a uniform distribution, compute x
It just happened to be uniform distributed as it's random generated within a range
Find a formula of the form $$\int_0^1 xf(x)dx = \sum_{i=0}^n A_if(x_i)$$ with $n=1$, that is exact for all polynomials of degree 3.
Trichloromethane
I try to use the method of undetermined coefficients where I basically just set my f(x) to be 1, x, x^2, and x^3 and then get a system of equations
but the weights and nodes I get are whack
never mind i had one of my coefficients wrong 

Find the formula: $$\int_0^1 f(x)dx = A_0 f(0) + A_1 f(1)$$ for functions $f(x) = ae^x + b\cos(x \pi/2)$
Trichloromethane
Ok
I know that you just need to evaluate the int of f and then solve for A0 and A1
First we have $f(0)=a+b$ and $f(1)=ae$ right?
Angetenar
yeah
Ok
And the integral is $ae - (a + 2b/\pi)$
Trichloromethane
Ok
Wait actually I think you have a sign error
I'm getting that $\int_0^1f(x)dx=ae-a+\frac{2b}{\pi}$
Angetenar
woops
Angetenar
So $A_0(a+b)+A_1(ae)=ae-a+\frac{2b}{\pi}$
Angetenar
Now we see that $A_0b=\frac{2b}{\pi}$ so $A_0=\frac{2}{\pi}$
Angetenar
ahhh, it was that sign mistake, i stg these trivial errors
yeah my algebra is just dog
Which then yields me:
Don't you need to divide by dx^2?
I write this as:
[
\frac{d^2 \hat{u}(t)}{dt^2}=c^2 A\hat{u}(t)
]
if dx=L/N then N^2 should be ok
Victor H
Ok
where A is:
Yes
Now
How the heck do I solve this with symplectic Euler
The symplectic Euler is:
Where H is the Hamiltonian and the Hamiltonian for the wave equation is:
To make it easier I did u=q and du/dt=p
And I get this:
Notice that \ddot{q}(t)=\dot{p}=c^2Aq(t)
And that means that:
And with the subsitution u=q we get back our initial
So it's the same system just written differently
I don't understnad how I'm supposed to iterate forwards with the symplectic method
Ok
With the symplectic euler method you are updating q and p
You don't work with any second derivatives
So you would be forward propagating p and q
Which is why I don't think it's the best method to use here
It's the assignment hehe so not much option
My problem is I don't know when/how to step time and/or spatial
You don't step in time and space
You instead step in p and q
From p and q you can calculate u(x,t)
But how do I step in p and q?
Do you know what a Hamiltonian system is?
Not more than what was given for this assignment; "A system is a Hamiltonian with the Hamiltonian H if the system can be written:
Ok
(u=q and du/dt=p)
How does the symplectic Euler method update?
Ok
Can you plug in the expressions you found for nabla_p and nabla_q?
Then discretize
[
\begin{split}
q^{n+1}&=q^{n}+hp^{n+1}\
p^{n+1}&=p^{n}-hc^2Aq^{n}
\end{split}
]
Victor H
Like so?
Yes looks right
So I'm guessing I first solve p^n+1 cuz I need it for q^n+1
Yes
Don't you have an initial condition?
You'll want to incorporate the boundary conditions as well
How?
Not entirely sure
Let's just say it's 0
The u in the original problem
I'm sorry I don't understand your question
You're trying to initialize p and q right?
Yeah
So you initialize u and use that to initialize p and q
Yes but it breaks with initial condition p=0 lol
because q also has initial condition 0
so it will never step forward
Well yeah
You don't have an initial condition specified
So I don't know how you're supposed to solve the problem
ye
its obvious theres a typo actually
The x3 and x4 are meaningless
wait G only needs to be C^1
if you want linear convergence
right, if |jacobian| < 1
G(x) = x-c*F(x) fixed point

like G doesnt need to be C2
cuz in the notes i think they said G needs to be C2 but i think not
wait fuk it needs to be
cuz error analysis the remainder needs to be bounded
If I have a function G:R^n->R^n and G(a) = a, and G is C^2
spectral radius of jacobian of G is 0, is it necessarily quad convergent?
cuz I thought this does not imply jacobian of G is 0
but the linear conv ratecan be made arbitrary small
I was under the impression that 0 spectral radius meant that the Jacobian was non-invertible and thus you could not have quadratic convergence
In fact, a spectral radius of 0 means that every eigenvalue of the Jacobian is 0
So the Jacobian is the 0 matrix
yea i agree with this
but
the jacobian is not necessarily 0 is what i mean
but then by taylor expansion the linear convergence rate can be made arbitrarily small because spectral radius is pretty much the lipschitz constant
though I'm guessing the question is that does that imply quadratic, or some unknown convergence rate s>1
this argument should work right
Seems fine
Hey guys, I'm solving the wave equation:
And I solve with symplectic Euler numerically
When I do g(x)=sin(3pi*x) I get this:
Which looks correct, right?
But when I try to do something more exotic with a square wave for example I get this
Why does it shake so violently and split off in like two directions
Look when I start the square wave in the middle
????
What is it doing man
I'm not familiar with symplectic methods, but have you tried decreasing time step and grid size?
If you refine it a bit more, the discontinuities will become smaller, yeah
This is a bit related to Gibbs' phenomenon
Not from a mathematical perspective
But from a moral perspective
Oh if it's similar to Gibbs in character you can't get rid of it entirely. But it should decrease with finer time step and grid size.
Anyways, these oscillatory issues will motivate higher order methods and methods based on characteristics
BUt why is it going in different directions in the beginning??
In the wave equation, initial data propagates to the left and to the right equally
If you did this for something like a linear advection equation, it would only move in one direction
Yes if it starts in the middle it goes both directions think of it like a guitar string plucked up at a point. It will fall down sending the wave in both directions.
Oh yeah that makes sense

Do something like u_t+a*u_x=0
Oh I change the initial wave diff.eq
Yeah
This two-sided propagation is a fundamental fact about the second order wave equation u_tt=u_xx
Yup this is advection which moves initial conditions at velocity a. Is there a word for how wave equation transports initial conditions?
Yes I know this is advection
You just say that the wave equation transports initial conditions along two characteristics
So you have a characteristic cone
The cone part is more important for the non-homogeneous wave equation though
Got it so that's what wave propagation is the transport of initial conditions along two characteristics
What would that be for convection? Would it not work because characteristics is limited to linear PDEs?
No, you can have characteristics for non-linear pdes
They just aren't lines anymore, but curves instead
However, not every pde has characteristics
Parabolic pdes such as the heat equation display infinite speed of initial data propagation
OK got it
Is this enforced by a lagrange multiplier? Like in the case of incompressible Navier Stokes my teacher told us pressure is a lagrange multiplier which enforces mass conservation exactly. So whatever needs to happen for mass conservation to hold propagates instantaneously across the flow. Does temperature for the heat equation serve that purpose?
How the heck do I change it to these these boundary conditions?
I feel like my entire numerical solution can't handle this
Ummmm
I am not too sure about the interpretation of pressure as a Lagrange multiplier
No problem, I'll ask my teacher about that
When you go through the derivative of Navier-Stokes you just enforce conversation of mass with the mass continuity equation $\pdv{\rho}{t}+\nabla\cdot(\rho u)=0$
Angetenar
Lagrange multipliers never show up
@neon snow how did you implement your previous boundary conditions?
I see
I don't know if this gives you anything but:
for m = 1:M
udot(:,m+1) = udot(:,m) + dt * c^2 * A*u(2:end-1,m);
u(2:end-1,m+1) = u(2:end-1,m) + dt * udot(:,m+1);
end
Do you see the u(2:end-1,...)
u_j(t) is each j:th point of the x-intervall
Inside the boundary
so j_0(t) and j_N(t) is the boundary
Uh huh ok
So instead of just saying what u is at the endpoints, you specify what the derivative is
You will want to discretize the derivative in x
So
Then you can get a system of equations to solve at the endpoint
