#advanced-probability
Ah the two examples i had in mind were unbiased lol, thanks
@remote rune what about the uniform distribution
Well if it is on [0, theta], the method-of-moments estimator for theta is unbiased, right, since it's just 2 times the sample mean
Ah yeah why didnt i think of that lol thanks
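A quick simulation sketch of that claim: for Uniform[0, theta] the method-of-moments estimator is twice the sample mean, and averaging it over many samples should land close to theta. The values of `theta`, `n`, `trials` and the seed are arbitrary choices for illustration.

```python
import random
import statistics


def mom_theta(sample):
    """Method-of-moments estimate of theta for Uniform[0, theta]: twice the sample mean."""
    return 2 * statistics.fmean(sample)


def average_mom_estimate(theta, n, trials, seed=0):
    """Average the MoM estimate over many independent samples of size n."""
    rng = random.Random(seed)
    estimates = [
        mom_theta([rng.uniform(0, theta) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.fmean(estimates)


avg = average_mom_estimate(theta=3.0, n=5, trials=20000)
print(avg)  # should be close to 3.0 if the estimator is unbiased
```

The exact statement behind it is E[2 * mean] = 2 * theta / 2 = theta, so the simulation is only a sanity check.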
Suppose I have 2 covariance matrices A and B, each for n jointly Gaussian random variables. Is there some result saying that if the diagonal entries of A are greater than or equal to those of B, and the off-diagonal entries of A are smaller in absolute value than those of B,
then the max of the n Gaussians with covariance A is larger in expectation than the max of the Gaussians with covariance B?
My kind of intuition for that is, the more independent the gaussians are, the larger is the max in expectation. I'm looking for such a result.
try Sudakov--Fernique
@grave oasis
maybe some care with the signs of the covariances
thanks a lot. This seems to be exactly what I was looking for. However, something feels off with that result. I'm not sure where I'm going wrong, but take the covariance matrix of 2 independent standard Gaussians (the identity matrix) and the covariance matrix of a standard Gaussian and its negative (ones on the diagonal, -1 off the diagonal). Then the independent Gaussians play the role of X and the coupled ones the role of Y. The inequality should hold, but that would mean the expected max of the independent Gaussians is smaller, which shouldn't be the case, should it?
ok maybe for the case I just stated it still works if I didn't mess up my calculations
so I guess my intuition is wrong for some cases, which are better captured by this inequality
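For reference, the example above can be checked numerically. Sudakov-Fernique compares the increments: if E(X_i - X_j)^2 <= E(Y_i - Y_j)^2 for all i, j (both vectors centered Gaussian), then E max X_i <= E max Y_i. The sketch below uses the two covariance matrices from the example; the helper name `increment_sq`, the sample size and the seed are just illustrative choices.

```python
import math
import random


def increment_sq(cov, i, j):
    """E(X_i - X_j)^2 for a centered Gaussian vector with covariance matrix cov."""
    return cov[i][i] + cov[j][j] - 2 * cov[i][j]


cov_indep = [[1.0, 0.0], [0.0, 1.0]]    # X: two independent standard Gaussians
cov_anti = [[1.0, -1.0], [-1.0, 1.0]]   # Y: (g, -g)

# Closed forms: E max(g1, g2) = 1/sqrt(pi), and E max(g, -g) = E|g| = sqrt(2/pi).
exp_max_indep = 1 / math.sqrt(math.pi)
exp_max_anti = math.sqrt(2 / math.pi)

# Monte Carlo estimates of the same two quantities.
rng = random.Random(0)
n = 200_000
mc_indep = sum(max(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)) / n
mc_anti = sum(abs(rng.gauss(0, 1)) for _ in range(n)) / n

print(increment_sq(cov_indep, 0, 1), increment_sq(cov_anti, 0, 1))  # increments: 2 vs 4
print(exp_max_indep, mc_indep)
print(exp_max_anti, mc_anti)
```

So the anticorrelated pair has the larger increments and the larger expected max, consistent with the inequality. The "more independent means larger max" intuition only holds when the covariances being compared are nonnegative, which is where the care with signs comes in.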
Any good texts on semimartingale theory? I've been using Eberlein and Kallsen, but I'm not sure if it's the best text
Thank you
is there like a "brownian motion" for continuous functions? like, suppose $f_t \colon [0,1] \to \mathbb{R}$ is a family of functions such that
- $f_0 \equiv 0$
- $\int (f_t - f_s) \sim N(0,t-s)$
- the $f_t$ are continuous functions, and increment independently
is this a thing?
Need more information on the integral (set you are integrating over, is it a stochastic integral, is it a Lebesgue integral, etc)
Given that this question is a "is this a thing" question, I'm guessing it doesn't particularly matter.
But if I were to formalize this, I'd say this seems like you're asking about a random walk in an L1-space.
my bad, I meant a lebesgue integral
probably f_{t-s} ~ f_t - f_s ?
that it's a continuous map? idk
maybe I should have prefaced this by saying I don't know what I'm talking about (which is probably obvious by now)
ty this is helpful
kind of related: the structure of martingales on a banach space is used to study the banach space's geometry
oh, that sounds neat, do you have an example?
thanks!
Question related to lemma 1.42 in Koralov & Sinai
Hi this might be a naive question and I hope it is relevant to this channel as it falls in the domain of financial mathematics.
Why in most time series models like ARCH,GARCH etc we assume the stationarity (in weak sense) of the series? Like does it have any physical intuition?
For example in the Black Scholes Model we assume that the log returns of stock prices to be normally distributed where the stock price follows the Geometric Brownian Motion process. The log normality of returns in this case as I have read is due to the efficient market hypothesis and an application of central limit theorem to the returns.
I was wondering is there any similar line of reasoning for the assumption of stationarity of time series in ARCH/GARCH models.
weak stationarity means that the mean, the variance, and the autocovariance at each lag don't change over time
in essence its first- and second-order statistical properties are constant
thanks
Hi, does anyone have a clue how to solve this?
Define the variation distance between two probability measures on say, a LCTVS by
$$\lVert\nu-\mu\rVert = \sup_{A}\bigl(\lvert\nu(A)-\mu(A)\rvert + \lvert\nu(X\setminus A) - \mu(X\setminus A)\rvert\bigr)$$
Then two measures are mutually singular iff their variation distance is 2
can someone help me with this? I'm not sure how to approach this (haven't done any probability in a bit)
intuitively the statement is very clear. I'm specifically interested in distance 2 implying mutually singular
i'm not sure how to procure the required set from the supremum
I have a question regarding courses at my university
one is called 'stochastic processes'
the other is called 'advanced probability 1'
the former uses durrett's essentials of stochastic processes
the latter uses billingsley
but by looking at table of contents
I can't really seem to see the differences
Beyond what the first answer on that MO post says (written by a guy who specializes in this stuff): BM on a Riemannian manifold is characterized by an extension of Levy's characterization to tangent spaces: $\forall f\in C^\infty:\ \mathrm{d}\langle f(W)\rangle_t=\lVert\nabla f(W_t)\rVert^2\,\mathrm{d}t$.
for $\lVert\nu-\mu\rVert=2\implies\nu\perp\mu$: under what circumstances does $\lVert\nu-\mu\rVert=2$? What does $|\nu(A)-\mu(A)|$ have to be? What does $|\nu(A^c)-\mu(A^c)|$ have to be? (given the measure of the whole space is one)
It's a supremum though, You don't know that there's a single set that realises the supremum
That's my point
Obviously if you could find such a set that would show they're mutually singular
My problem is the weight can bounce around as you approach the supremum
From measure to measure
It's definitely old
And doesn't cover the standard approach to probability iirc (with Lebesgue integral)
the first part is mostly about discrete probability iirc, and the second one is about continuous random variables, although like I said, the approach is a bit outdated, and in second part it's especially visible
nonetheless, because of how this book is written, it's still considered a worth experience to read it
at least the first volume
I didn't read a lot of it, but my opinion is based on other people opinions about it
For the setup that you've already done I think, take sets A_n approaching the sup of 2. Since both quantities |nu(A) - mu(A)| and |nu(X-A) - mu(X-A)| are bounded by 1, |nu(A_n) - mu(A_n)| and |nu(X-A_n) - mu(X - A_n)| both must approach 1. Now, max(nu(A_n), mu(A_n)) approaches 1 and min(nu(A_n), mu(A_n)) approaches 0 (I think this is what you're worried about in terms of the measure bouncing around).
I'm assuming you don't want the answer immediately so I'm going to give you 3 vague steps to fix this and put in spoilers the specific thing that you should be doing
- Pass to a subsequence of A_ns ||so that the max is always attained by nu and the min is attained by mu, wlog||
- Pass to a further subsequence of A_ns ||so that the sum of the mu(A_n)'s is finite||
- Pick your set to be ||the limsup of the A_ns||. this works because ||the nu(A_ns) go to 1|| and ||borel cantelli says mu(limsup A_n) = 0||.
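Spelled out, the three steps above can be written as the following sketch. It assumes $\mu$ and $\nu$ are probability measures on the same measurable space, and it replaces the "wlog" in the first step by swapping $A_n$ for its complement where needed.

```latex
% Assumption: \mu, \nu are probability measures, so
% |\nu(A^c) - \mu(A^c)| = |\nu(A) - \mu(A)| for every measurable A.
\begin{align*}
&\text{Choose } A_n \text{ with } |\nu(A_n) - \mu(A_n)| \to 1; \text{ replacing } A_n \text{ by } A_n^c
  \text{ where needed, } \nu(A_n) \to 1 \text{ and } \mu(A_n) \to 0. \\
&\text{Pass to a subsequence with } \mu(A_n) \le 2^{-n}, \text{ so that } \textstyle\sum_n \mu(A_n) < \infty. \\
&\text{Set } A := \limsup_n A_n = \bigcap_{N \ge 1} \bigcup_{n \ge N} A_n. \\
&\mu(A) \le \sum_{n \ge N} \mu(A_n) \le 2^{1-N} \text{ for every } N
  \quad\Rightarrow\quad \mu(A) = 0 \quad \text{(Borel--Cantelli)}. \\
&\nu(A) = \lim_{N \to \infty} \nu\Bigl(\bigcup_{n \ge N} A_n\Bigr) \ge \limsup_{n} \nu(A_n) = 1
  \quad\Rightarrow\quad \nu(A) = 1. \\
&\text{Hence } \mu(A) = 0 = \nu(A^c), \text{ i.e. } \nu \perp \mu.
\end{align*}
```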
Sorry, I should have seen this 2 days ago
Thanks!
I already finished the talk that this was a part of
But I'll think about this
I see
Does anyone know if the cameron-martin space of a gaussian measure defined on a LCTVS on the sigma algebra making all continuous functionals measurable is always measurable?
The cameron-martin space being
The space of all elements $h$ where the supremum
$$\sup\{f(h) : f\in X^{\ast},\ \operatorname{Cov}(f,f)\leq 1\}$$
Is finite.
Where we are assuming functionals are $L^2$. Equivalently it is the space of all elements h such that the translation by h measure is equivalent to the original measure
Maybe someone has seen this in the context of cameron martin theorem
Real vector space
I think you might need to take completion
In general
hi guys
I have a question related to probability problem with Markov's chains.
I have a state matrix of first and second degree. If i know the state in which the object is the most probably in after 2 steps, how can I find the "starting state" the object started from?
What
The sup is the Cameron martin norm of a single element h
It's not the def of the space
The space is all those elements with finite norm
So I don't understand what you mean
are densities always measurable maps?
the RN process is measurable
what is the RN process?
ah ok
but the actual density need not be measurable?
wrt B(R) and whatever sigma algebra’s in the domain
The density is measurable
after all how do you define the lebesgue integral without measurability
@keen loom thank you
You're welcome
Hello
I got this question
is my answer right?
Did you find any mistakes there? thank you
Is it true that $e^{X_{t}}$ is markovian if $X_{t}$ is levy?
question
for part A, is solving it that different than solving one with no drift
ignore part B i think ik how to do that
i mean do u have to go about it differently solving it with no drift than solving it with drift, and if so, how
because solving it with drift is very easy
then set the drift to zero
hi, I understand why, for gamma(a,b) data with unknown rate b, the conjugate gamma(a0, b0) prior gives a gamma(a0 + na, b0 + sum(x)) posterior. but if I
*I'm told that b = 1/theta, how does that fit in?
nevermind, I think inverse gamma for the prior then
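A minimal sketch of that update, assuming the data are Gamma(shape a, rate b) with a known and the prior on b is Gamma(a0, b0) in the rate parametrization. If the model is written with the scale theta = 1/b instead, the same numbers describe an inverse-gamma posterior on theta. The function names here are made up for illustration.

```python
def gamma_rate_posterior(a, data, a0, b0):
    """Posterior (shape, rate) for the rate b of Gamma(a, b) data under a Gamma(a0, b0) prior."""
    n = len(data)
    return a0 + n * a, b0 + sum(data)


def posterior_mean_rate(shape, rate):
    """Mean of b under a Gamma(shape, rate) posterior."""
    return shape / rate


def posterior_mean_scale(shape, rate):
    """Mean of theta = 1/b, which is InvGamma(shape, rate); it exists only for shape > 1."""
    if shape <= 1:
        raise ValueError("the inverse-gamma mean requires shape > 1")
    return rate / (shape - 1)


data = [1.0, 2.0, 3.0]
shape, rate = gamma_rate_posterior(a=2.0, data=data, a0=1.0, b0=1.0)
print(shape, rate)                        # 7.0 7.0
print(posterior_mean_rate(shape, rate))   # 1.0
print(posterior_mean_scale(shape, rate))  # 7/6
```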
anyone got a good source on how strongly the spectral norm of a symmetric matrix with i.i.d. standard gaussian entries concentrates around its expectation?
Well I'm really just looking for user friendly upper bounds
I know there is this, if you give the diagonal entries a variance of 2 instead of 1. So I assume for my question this should be pretty similar apart from some constant factors somewhere
ah wait I think I can use this
sigma = 1 should work, edit: no it doesn't
Yes exactly since the norm is 1 lipschitz
although actually there is probably a sqrt(2) in there because of the symmetric requirement
edit: insert random rambling
but maybe I'm overlooking something
no, you're right, the norm irritated me nevermind
the sqrt(2) is needed
I somehow think of spectral norms every time I see two lines now, even though it's clearly the euclidean one here
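For the record, here is one way to see where the sqrt(2) comes from. This is a sketch assuming the entries on and above the diagonal are independent $N(0,1)$ and the rest are filled in by symmetry; $g$ denotes the vector of those independent entries and $A(g)$ the resulting symmetric matrix.

```latex
% Assumption: g = (g_{ij})_{i \le j} has i.i.d. N(0,1) entries, A(g) is the
% symmetric matrix with A_{ij} = A_{ji} = g_{ij}.
\begin{align*}
\bigl| \lVert A(g) \rVert_{\mathrm{op}} - \lVert A(g') \rVert_{\mathrm{op}} \bigr|
  &\le \lVert A(g - g') \rVert_{\mathrm{op}}
   \le \lVert A(g - g') \rVert_{F} \\
  &= \Bigl( \sum_{i} (g_{ii} - g'_{ii})^2 + 2 \sum_{i < j} (g_{ij} - g'_{ij})^2 \Bigr)^{1/2}
   \le \sqrt{2}\, \lVert g - g' \rVert_2 .
\end{align*}
% So g \mapsto \lVert A(g) \rVert_{op} is \sqrt{2}-Lipschitz, and Gaussian
% concentration for L-Lipschitz functions, P(|F - EF| \ge t) \le 2 e^{-t^2 / (2 L^2)}, gives
\[
  \mathbb{P}\bigl( \bigl| \lVert A \rVert_{\mathrm{op}} - \mathbb{E} \lVert A \rVert_{\mathrm{op}} \bigr| \ge t \bigr)
  \le 2 \exp\!\left( -\frac{t^2}{4} \right).
\]
```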
Hi, this may or may not be an advanced question, but I am in an advanced math class just am struggling with a basic concept, if I have a piecewise cdf, and I want to try to generate a pdf from that cdf using the fundamental theorem of calculus, if y=x^2 do I need to consider any piecewise which is defined on x < 0?
generally you need to take the derivative everywhere yes, (well, everywhere where it is possible, it jumps at points where it is not differentiable) but y=x^2 is a bit of a weird choice since you have to cut it off at y=1 for it to become a cdf
"we immediately see", well I certainly don't
does anyone know why this is considered obvious
Maybe this is a dumb question, but what does it mean for Z = N(0,\sigma^2) here? I assume that is not a typo based on the pseudocode.
Hello. If X is a stochastic process and T a stopping time, what does the notation $X_T$ mean? Is it maybe the random variable defined by $X_T(\omega) := X_{T(\omega)}(\omega)$?
That is how I would interpret that notation without any other context. Can't say I love the choice of T as a stopping time though
do you have more specific info on J?
The way I like to think of it is the limit as $t\to\infty$ of $X_{t\wedge{T}}$
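A small sketch of both readings on a single sample path, using a simple random walk and a hitting time as an assumed example (the walk, the level 3, the horizon 50 and the seed are all arbitrary). `X_T` is the path evaluated at its own stopping time, and `stopped` is the stopped process $X_{t\wedge T}$, which is constant from time T on.

```python
import random


def sample_path(n_steps, rng):
    """Simple symmetric random walk X_0 = 0, ..., X_n on one sample point omega."""
    path = [0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choice((-1, 1)))
    return path


def first_hit(path, level):
    """T = first index t with X_t == level, capped at the last index if never hit."""
    for t, x in enumerate(path):
        if x == level:
            return t
    return len(path) - 1


rng = random.Random(1)
path = sample_path(50, rng)
T = first_hit(path, level=3)

X_T = path[T]                                           # X_T(omega) = X_{T(omega)}(omega)
stopped = [path[min(t, T)] for t in range(len(path))]   # the stopped process X_{t ∧ T}

print(T, X_T, stopped[-1])
```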
I actually found a more thorough explanation
No, this means that w0 is a vector sampled from a gaussian distribution with independent entries, each with variance sigma^2
What are the probabilities associated with J?
this shouldn't be relevant according to lemma 2.3, I would like to post it, but that's like 2 pages of derivations
You said it was 1, -1 or 0? Does it require 2 pages of derivations to say J takes the value 1 with probability blah?
This is a more thorough explanation of how this algorithm is supposed to work
it's not gonna help you much, but I'm just gonna send it anyways
I can mostly see all the pieces with the additional context, but some bit of linear algebra is rusty enough that I can't tie it all together
By substitution you see that essentially $J_{x^{\prime}}^{\sigma^{\prime}} = J_{w_{i-1}}^{\sigma}$. It's not quite an equality, which is why you correct for it by moving the J units in the direction $v_i$. The choice of the $\sigma^{\prime}$ is exactly what you need so that in the exponential, there is a change of variables $j \mapsto j \lVert v_i \rVert$. I feel like there is some trivial linear algebra I have just forgotten to put it all together.
@sterile sail this might make things more clear for you, just replace the sigma_t there by the J (I know, inconvenient notation, but they are 2 different papers)
I probably need a source on how gaussian vectors behave under projection, addition with dependence/independence etc. that should clear up a lot of the confusion
does anyone know of a book or lecture notes on pde aimed at probabilists? i have no pde background and i want to learn spde, so im looking for a crash course on the minimal pde to get started with that
more specifically i want to read hairer’s rough paths book and his various lecture notes so that eventually i can study some of the more analysis-heavy papers on solving kpz equation and other important spdes.
i truly know no pde beyond a couple weeks in a diffeq class for engineers though haha
If you have a good background in real analysis use evans
im looking for something a bit more succinct than evans, if it exists
i assume the entire book is not needed for what i’m aiming to learn?
but yes i have plenty of background in analysis, i just never learned pde
ah i never googled “pde for probabilists” until just now, but apparently there is this book https://www.cambridge.org/core/books/partial-differential-equations-for-probabilists/30BB576097CF5ECF7914FDBCAF69E1F2
has anyone here read it
I don't know anything about SPDEs, but iirc it is very technical and pde heavy, so you probably want to know PDEs very well (maybe everything in Evans is enough). Idk exactly though
ok, but analysis-heavy does not imply pde-heavy
my impression that reading evans cover to cover is way overkill for getting started with the spde resources i mentioned, i just can’t really tell what can be blackboxed and what is too important to skip
does anyone here know about spde & can answer?
Is there some way to obtain a probability density function from partial derivatives of the cumulative distribution function, like in the 1-dimensional case?
@keen loom $f(x, y) = F_{xy}(x, y)$
really? That's unexpected for me
$F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y}f(a, b)\,db\,da$
$\implies F_x(x, y) = \int_{-\infty}^{y}f(x, b)\,db$.
this means that if $(X, Y)$ has a continuous density $f$, then $F_{xy} = f$.
Conversely, if $F$ is $C^2$, then we can work backward and obtain that $F_{xy}$ is the density
makes sense
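A numerical sanity check of $F_{xy} = f$, as a sketch: the joint CDF below is for two independent Exp(1) variables, which is an assumed example and not something from the question, and the mixed partial is approximated by a central finite difference.

```python
import math


def joint_cdf(x, y):
    """Joint CDF of two independent Exp(1) variables (assumed example), for x, y >= 0."""
    return (1 - math.exp(-x)) * (1 - math.exp(-y))


def joint_pdf(x, y):
    """The corresponding joint density, for x, y >= 0."""
    return math.exp(-x - y)


def mixed_partial(F, x, y, h=1e-3):
    """Central finite-difference approximation of d^2 F / (dx dy) at (x, y)."""
    return (
        F(x + h, y + h) - F(x + h, y - h) - F(x - h, y + h) + F(x - h, y - h)
    ) / (4 * h * h)


approx = mixed_partial(joint_cdf, 0.7, 1.3)
exact = joint_pdf(0.7, 1.3)
print(approx, exact)  # the two values should agree to several decimal places
```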
The longer I look at this, the more I feel like there is a mistake in there, the x' should be divided once more by $||v_i||_2$, otherwise there is some variance problem
Is there a theory of non-homogeneous Markov chains?
People assume that they are homogeneous, but it feels like what's really the interesting part is the non-homogeneous ones
Check out reinforcement learning
As your policy improves, the transition probabilities of the environment will change over time
And transition probabilities changing over time means a non-homogeneous Markov chain
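To make "transition probabilities changing over time" concrete, here is a minimal sketch of a two-state non-homogeneous chain: the distribution at time t is the initial row vector multiplied by a different matrix at each step. The kernel `transition_matrix(t)` is a made-up example, not taken from any RL setting.

```python
def step(dist, P):
    """One step of the chain: row vector dist times transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def transition_matrix(t):
    """Hypothetical time-dependent kernel: the chance of leaving state 0 is 1/(t+2)."""
    p = 1.0 / (t + 2)
    return [[1 - p, p], [0.5, 0.5]]


dist = [1.0, 0.0]  # start in state 0
for t in range(3):
    dist = step(dist, transition_matrix(t))

print(dist)  # distribution after 3 steps, here [31/48, 17/48]
```

In the homogeneous case `transition_matrix` would ignore `t`, and the loop would reduce to multiplying by a matrix power.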
Can the binomial distribution approximate the Gaussian?
Can a uniform distribution have any bounds that we like?
a Binomial(n, p) is an n-fold convolution of Bernoulli(p)'s, so after centering and scaling it is asymptotically gaussian (de Moivre-Laplace)
any bounds less than infinity/more than negative infinity
in fact one can prove that there is no 'uniform' (translation-invariant) countably additive probability measure on all of the reals
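On the binomial question, a quick numerical comparison, as a sketch: it compares the Binomial(n, p) pmf with the normal density that has the same mean and variance. The values n = 100 and p = 0.3 are arbitrary.

```python
import math


def binomial_pmf(n, p, k):
    """P(Bin(n, p) = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


n, p = 100, 0.3
mean, sd = n * p, math.sqrt(n * p * (1 - p))

# Largest pointwise gap between the pmf and the matching normal density.
max_gap = max(
    abs(binomial_pmf(n, p, k) - normal_pdf(k, mean, sd)) for k in range(n + 1)
)
print(max_gap)
```

Increasing `n` should shrink the gap, which is the local version of the limit theorem.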
Easiest way to find the marginal cdf given a joint cdf?
I think #probability-statistics is a better place to ask
let the random-variable you don't care about be less than or equal to infinity and keep the one you care about less than whatever value you are focusing on
E.g. for joint CDF F(x, y) of random variables X and Y, the marginal CDF of X is F(x, ∞), i.e. the limit of F(x, y) as y → ∞
Im currently learning for my exam and i want to solve this exercise:
This exercise is from an older exam so it should be relatively easy to solve, however i dont know how
First off, we don't really know what the distribution of Z is (I looked it up, it's called the gamma distribution, which we don't know yet). Second, what would the joint density of X and Z look like? It sounds like a lot of work to compute it, but like I said, this exercise shouldn't take too long and especially shouldn't need long calculations
I was thinking that maybe X has to be independent of Z, then things would be easier
However that doesnt seem to be true as well
What can i try here?
E[X|Z] = E[Y|Z] so E[X|Z] = (X+Y)/2
no need for any calculations
Ah yes that makes sense
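Written out, the symmetry argument is the following sketch. The exercise itself is not shown here, so this assumes $X$ and $Y$ are i.i.d. and integrable with $Z = X + Y$.

```latex
% Assumption: X, Y i.i.d. and integrable, Z = X + Y.
\begin{align*}
E[X \mid Z] + E[Y \mid Z] &= E[X + Y \mid Z] = E[Z \mid Z] = Z, \\
E[X \mid Z] &= E[Y \mid Z]
  \quad \text{(symmetry: $(X, Z)$ and $(Y, Z)$ have the same joint law)}, \\
\Rightarrow\quad E[X \mid Z] &= \tfrac{Z}{2} = \tfrac{X + Y}{2}.
\end{align*}
```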
If you compute E[X^2|Z] then you can similarly get E[XY|Z] = (X+Y)^2 /2 - (E[X^2 | Z] + E[Y^2 | Z])/2
Adding to Blitz's answer you can also easily obtain the conditional density of X given Z by setting it to (E(X|Z)f_X)/E(X)
Ahh okay i now got E[XY|Z] = E[X^2|Z] = 1/4 * (X+Y)^2
Let me think about the density for a bit
Ahh yes now i understand
Thanks a lot guys
Hey, i was thinking about the formula you gave, which is really useful since we dont need the joint density function at all, however i have never seen this formula before, intuitively it looks correct however im unable to prove that its correct, do you know a book where it is explained more?
I have a problem trying to prove that $$P(X_{k_n} = m_n | X_{k_{n-1}} = m_{n-1}, ..., X_{k_0} = m_0) =$$ $$ P(X_{k_n} = m_n | X_{k_{n-1}} = m_{n-1})$$ for a Markov chain $X_k$
this is the assumption of markov chain
no
Intuitively it’s true
It’s saying that probability of next state depends only on previous state and not the history of states
Proof would be more challenging though I see what Kong is saying about it being the assumption
This is how markov chains are introduced in books
This thing is called the "Markov property", and one way to define a Markov chain is "a sequence of random variables that has the “Markov property”", see http://www.stat.yale.edu/~pollard/Courses/251.spring2013/Handouts/Chang-MarkovChains.pdf
No. This is pretty much a corollary of exercise 1.5b of what you linked
You can call it Markov property, sure, but not even what you cite calls this Markov property
It does
No, this is a different condition
anyway, exercise 1.5a helped a lot, thank you
emm, if you need help to prove this, then you need to give your definition of markov chain first
not anymore
we have a function f in L^1 defined on the domain [0,1). Is it true that we can find a continuous function g defined on [0,1] such that $\lVert f-g\rVert_{L^1}$ is arbitrarily small?
I think so. try choosing a simple function $s:[0,1]\to \mathbb R$ such that $\lVert f-s\rVert_{L^1} < \varepsilon/2$, and then using appropriate bump functions, make a continuous function $g$ such that $\lVert g-s\rVert_{L^1} < \varepsilon/2$.
you can choose the simple function by splitting $f=f^+ - f^-$, and choosing simple functions $s^+, s^-$ such that $0\leq s^+ \leq f^+$ and $0\leq s^- \leq f^-$ that satisfy
$$\int_{[0,1]} (f^+ - s^+) \,\mathrm{d} \mu < \varepsilon/4, \qquad \int_{[0,1]} (f^- - s^-) \,\mathrm{d} \mu < \varepsilon/4,$$
using the definition of integrability
then, for $s=s^+ - s^-$, you get
$$\lVert f-s\rVert_{L^1} = \int_{[0,1]} |f-s| \,\mathrm{d} \mu$$
$$= \int_{[0,1]} |(f^+ - s^+) - (f^- - s^-)| \,\mathrm{d} \mu$$
$$\leq \int_{[0,1]} (f^+ - s^+) \,\mathrm{d} \mu + \int_{[0,1]} (f^- - s^-) \,\mathrm{d} \mu$$
$$< \varepsilon/2$$
now, we want to choose the continuous approximation of s
write $s = \sum_{i=1}^n a_i \chi_{E_i}$. The crux is approximating the indicator functions to arbitrary precision. hmmm how do we do this
ahh nice we can use regularity of Lebesgue measure and Urysohn's lemma
so the last thing that we need to accomplish is approximate an indicator function $\chi_E$ by a continuous function to arbitrary precision. Once we figure out how to do this, we can define our $g$ as a sum of indicator approximations with error less than $\varepsilon/(2n|a_i|)$ or something like that. So let's just abstract away the details and try to approximate $\chi_E$ by a continuous function with error less than $\varepsilon$
Since $E$ is Lebesgue measurable, we can pick a closed set $F\subseteq E$ and an open set $G\supseteq E$ such that $\mu(G\setminus F) < \varepsilon$
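One way to finish from here, as a sketch: with $F \subseteq E \subseteq G$ as above, an explicit Urysohn-type function built from distances does the job. Here $d(x, S)$ denotes the distance from $x$ to the set $S$ inside $[0,1]$, and $G^c = [0,1]\setminus G$.

```latex
% Assumptions: F \subseteq E \subseteq G \subseteq [0,1], F closed, G open in [0,1],
% \mu(G \setminus F) < \varepsilon. If G^c is empty take g \equiv 1; if F is empty take g \equiv 0.
\[
  g(x) := \frac{d(x, G^c)}{d(x, F) + d(x, G^c)}, \qquad x \in [0,1].
\]
% The denominator is positive because F and G^c are disjoint closed sets, so g is
% continuous with 0 \le g \le 1, g = 1 on F and g = 0 on G^c. Therefore
\begin{align*}
  |\chi_E - g| \le \chi_{G \setminus F}
  \quad\Rightarrow\quad
  \lVert \chi_E - g \rVert_{L^1} \le \mu(G \setminus F) < \varepsilon .
\end{align*}
```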
Read rules and don’t post same question in 3 channels
oh oops did they?
oh lol
Thank you for help
I'm looking for a little statistics related assistance. For brevity, I already have help-8 open where I've been working on providing any necessary background for those who may need to understand the notations or see an MWE or source file. Would be very much appreciated if someone could take a look at that room and see if it'd be possible to assist me in deciphering the conditional formulaic switch
I'm very confused about what exactly you need help with
Hi, thank you, could you jump over to help channel 8, I'll clarify there
ok
Anyone know how I’m supposed to do this
I forgot sums in the exponent in the last line
But idk where to go from here. This was on a qualifying exam and I would never have time to do all that algebra
And I don’t see anything simplifying
Work with the log likelihood
Does anyone know about families of rotationally invariant distributions on R^n, s.t. projections onto orthogonal subspaces of random vectors drawn from those distributions create independent random vectors.
The standard multivariate Gaussian distribution, for example, belongs to that family, as projections of a standard Gaussian vector onto orthogonal subspaces are independent.
However I'm looking for distributions with that property that possibly concentrate stronger around 0 than the gaussian distribution does
but I'm not sure if there are any
ok after looking a bit further I do in fact think the answer is probably no
Yeah, the result is called "Maxwell characterisation of Gaussian distributions"
We don't define recurrent states in a Markov chain in general, right. Just for homogeneous ones
yeah recurrent states don't apply to nonhomogeneous markov chains
same goes for transient states
thank you