#Help with MLE Estimators


woeful dune
#

Frankly, I have no idea what I have to do here, and I am on the brink of tears. I just don't get anything.

woeful dune
#

I don't understand what the MLE estimator is in this case. Is this just finding the usual MLE for sigma? But what is r in this case?

broken plank
#

You suppose that there is a nearly linear relationship between them

#

So there is a slope r such that V = r X + epsilon, where epsilon ~ N(0, sigma²)

#

So here usually it is easier to provide an estimation for the slope rather than sigma²

#

This is just a linear regression case

#

You could think that conditionally to X, V follows a normal distribution N(rX, sigma²)

#

Hence the conditional log likelihood of V is given by $\ln f_{V|X,r, \sigma^2}(v|x, r, \sigma^2) = \sum_{i=1}^n \frac{(v-rx_i)^2}{\sigma^2} - \frac{n}{2}(\ln \sigma^2 + \ln 2\pi)$

broken plank
#

Where x denotes the vector of all observations (x1, ..., xn), which are independent

#

So now you have an optimization problem

#

With 2 variables

#

How do you solve optimization problems with 2 variables?

woeful dune
#

I just want to know how to do the things in that exam and thats it ig

broken plank
#

And the critical points

woeful dune
broken plank
#

You see that the function there is a function of r and sigma²

#

(instead of sigma² I will use v)

woeful dune
#

okay

broken plank
#

So say $g(r, v)$

woeful dune
#

okay

broken plank
#

You compute $\nabla g (r, v)$ and equate it to 0

broken plank
#

In other words, it's a system of equations, which are

#

$$\frac{\partial g}{\partial r} (r,v) = 0$$

$$\frac{\partial g}{\partial v} (r,v) = 0$$

broken plank
#

2 equations, 2 unknowns r and v

#

your turn now

#

i need to eat

woeful dune
#

The first would be something like Summation 2xi (v-rxi) / sigma^2

The second would be something like Summation -2(v- rxi)/sigma^3 + n/sigma?

broken plank
#

almost correct

woeful dune
#

huh

#

let me double check second

broken plank
#

also since the computations are fairly complex, if you want me to check I ask you to write with tex

woeful dune
#

i dont know how to work the bot, let me see if i can figure it out

broken plank
#

it's fairly intuitive

#

take example from my texts

#

$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$

broken plank
#

When you enclose your math text between dollar signs, tex will render and display your message

#

And the round d is given by the command \partial

#

$\frac{\partial g}{\partial r}$

woeful dune
#

$\sum_{i=1}^n \frac{2xi(v-rxi)}{sigma^2}$

woeful dune
#

i didnt know how to do sigma

broken plank
#

Yeah, don't forget the subscripts

#

and for greek letters, add a backslash

#

$\sigma$

woeful dune
#

gotcha

broken plank
#

$x_i$ too

woeful dune
#

$\sum_{i=1}^n \frac{2x_i(v-rx_i)}{\sigma^2}$

broken plank
#

Also, I apologize

woeful dune
#

huh

#

what for

broken plank
#

I shouldn't have used v for the variance, it's a symbol already in use

#

I wanted to replace $\sigma^2$ with $v$ but $v$ is already in use

woeful dune
#

yeah, got that, used t in my head

broken plank
#

Let's say instead that it's $\nu$

broken plank
#

or t if you want

#

So again:

woeful dune
#

i think i found my mistake

#

let me see if i can write it out

#

second

#

$\sum_{i=1}^n \frac{-(v-rx_i)^2}{\sigma^4} + frac{-n}{2\sigma^2}$

woeful dune
#

oh shit i fucked the frac up

#

$\sum_{i=1}^n \frac{-(v-rx_i)^2}{\sigma^4} + \frac{-n}{2\sigma^2}$

woeful dune
#

there

broken plank
#

$g(r, \nu) = \sum_{i=1}^{n} \frac{(v_i - r x_i)^2}{\nu} - \frac{n}{2} (\ln(\nu) + \ln(2\pi))$

#

I think

woeful dune
#

is my solution not correct?

broken plank
#

I just rewrote the original function to optimize for you

woeful dune
broken plank
broken plank
#

before frac

woeful dune
#

the partial derivative wrt to sigma^2

broken plank
#

Yeah, that's good, except for the fact that it's v_i instead of v

#

What about wrt r?

woeful dune
broken plank
woeful dune
#

$\sum_{i=1}^n \frac{2x_i(v-rx_i)}{\sigma^2}$

broken plank
#

Also you're missing a -

woeful dune
#

$\sum_{i=1}^n \frac{-2x_i(v-rx_i)}{\sigma^2}$

broken plank
#

v_i, not v

#

since it's a target per observation

#

one v_i per x_i

woeful dune
#

$\sum_{i=1}^n \frac{2x_i(v_i-rx_i)}{\sigma^2}$

woeful dune
#

shit yeah, you're right

#

mb

broken plank
#

Yeah, now the minus

woeful dune
#

$\sum_{i=1}^n \frac{-2x_i(v_i-rx_i)}{\sigma^2}$

woeful dune
#

$\sum_{i=1}^n \frac{-(v_i-rx_i)^2}{\sigma^4} + \frac{-n}{2\sigma^2}$

broken plank
#

To recap what you said:
$$\frac{\partial g}{\partial r} (r, \nu) = -\sum_{i=1}^n \frac{2x_i(v_i-rx_i)}{\nu} = 0$$

$$\frac{\partial g}{\partial \nu} (r, \nu) = \sum_{i=1}^n \frac{-(v_i-rx_i)^2}{\nu^2} + \frac{-n}{2\nu} = 0$$

woeful dune
#

yes

broken plank
#

There we go

#

Now, 2 unknowns, 2 equations

#

You need to solve this

#

hint: start by finding the optimum r from the first equation

#

it should be fairly ok

#

then sub in the 2nd

woeful dune
#

okay, give me a moment, i've never solved equations with summation before

#

our math courses were seriously lacking, i am finding

#

for the first one, intuitively $r = \frac{-x_i}{v_i}$

#

ah i have to write the second one out, second

broken plank
woeful dune
#

damn it

broken plank
#

I intended to commit to helping you when I first answered this post

#

so it's fine

woeful dune
#

isn't that only 0 when $v_i - rx_i = 0$

broken plank
#

I will do a demonstration for you with the first one

#

and you will try with the second one

woeful dune
#

okay thank you

broken plank
#

Here we just split and distribute the sum

#

$$\frac{\partial g}{\partial r} (r, \nu) = -\sum_{i=1}^n \frac{2x_i(v_i-rx_i)}{\nu} = 0$$

broken plank
#

Assuming that we look for solutions $(r, \nu)$ such that $\nu > 0$, then we can maintain equivalence by multiplying by $\nu$

broken plank
#

$\Longleftrightarrow \sum_{i=1}^n x_i(v_i-rx_i)= 0$

woeful dune
#

makes total sense to me, yeah

broken plank
#

Now, we split the sum

#

$\Longleftrightarrow \sum_{i=1}^n (x_i v_i-rx_i^2)= 0$

woeful dune
#

then we split on both sides

broken plank
#

$\Longleftrightarrow \sum_{i=1}^n x_i v_i - \sum_{i=1}^n rx_i^2= 0$

woeful dune
#

move to the other side

#

sorry, i am saying this because it helps me follow

broken plank
#

$\Longleftrightarrow \sum_{i=1}^n rx_i^2= \sum_{i=1}^n x_i v_i$

broken plank
#

Now, see that the r doesn't depend on the sum index

#

we can factorize it

#

$\Longleftrightarrow r \sum_{i=1}^n x_i^2= \sum_{i=1}^n x_i v_i$

broken plank
#

Finally:

#

$\Longleftrightarrow r = \left( \sum_{i=1}^n x_i^2 \right)^{-1}\sum_{i=1}^n x_i v_i$

broken plank
#

For a critical point $(r, \nu)$ such that $\nabla g(r, \nu) = 0$, it necessarily holds that $r$ has the above expression

broken plank
#

Conversely (since we only used $\Longleftrightarrow$), that expression of $r$ will provide $\frac{\partial g}{\partial r}(r, \nu) = 0$

woeful dune
#

can't we simplify that to $r= \frac{v_i}{x_i}$

#

?

broken plank
#

No

#

Here it is not clear what is i

#

so your expression does not make sense

#

here the i that I used belongs to the summation

woeful dune
#

so I need to use that entire thing you wrote out in the second one?

#

holy madlad

broken plank
#

$r = \frac{x_1 v_1 + \dots + x_n v_n}{x_1^2 + \dots + x_n^2}$
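
A small worked example may help show why this cannot collapse to a single $\frac{v_i}{x_i}$. The numbers are made up purely for illustration: take $x = (1, 2, 3)$ and $v = (2, 3, 7)$.

$$r = \frac{1 \cdot 2 + 2 \cdot 3 + 3 \cdot 7}{1^2 + 2^2 + 3^2} = \frac{29}{14} \approx 2.07$$

$$\frac{v_1}{x_1} = 2, \qquad \frac{v_2}{x_2} = 1.5, \qquad \frac{v_3}{x_3} = \frac{7}{3} \approx 2.33$$

Each observation suggests its own ratio, and the sum-based formula produces one compromise value across all of them.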

broken plank
woeful dune
#

jesus

#

okay

broken plank
#

But it's not the most complicated estimator of the slope

#

MLE estimator is fairly simple

#

in comparison to other ones

#

if there are any

#

Anyway, use my example to try to figure out the expression of $\nu$

broken plank
#

Since now you can just substitute $r$

woeful dune
#

considering we covered Monte Carlo Simulation, Data Analysis, PCA and Multilinear Regression in 4 days, i am sure there are more we've missed in our lectures

broken plank
#

sounds like data science to me

woeful dune
#

it is

broken plank
#

or stats i guess

woeful dune
#

data science

#

but doing the entire thing in 4 days was pretty stupid

broken plank
#

Then you gotta up your calc because monte carlo is going to hurt

#

pca is not too hard

woeful dune
#

i know madlad

#

oh trust me, i know

#

sigh

broken plank
#

Anyway, linear regression is one of the simplest models that can be demanded of you if you ever do a data scientist interview

#

so it'd be really beneficial for you to really understand that

#

Since I don't know if I will be here in a couple of hours, I will give you the answer so that you can check your MLE estimators

#

$\hat{r} = \left( \sum_{i=1}^n x_i^2 \right)^{-1}\sum_{i=1}^n x_i v_i$

broken plank
#

$\hat{\nu} = \frac{1}{n}\sum_{i=1}^{n} (v_i - \hat{r} x_i)^2$

woeful dune
#

shouldn't there be a 2

broken plank
#

No, pretty sure it's this

#

I cross checked with internet

woeful dune
#

i trust you with my entire life

broken plank
#

You can try to carry out the computations

#

And see if you land the same result

woeful dune
#

I landed on

#

$\hat{\nu} = \frac{2}{n}\sum_{i=1}^{n} (v_i - \hat{r} x_i)^2$

woeful dune
#

but i will take your word for it

#

So bear with me for just 5 minutes, we just found question 1 right

#

just to confirm (I am losing my mind)

broken plank
#

Well I will check again just to be sure

#

Ah, ok, I see the problem

#

We were asked to estimate $\sigma$, not $\sigma^2$

broken plank
#

The answer I gave you was the estimate of sigma, which I squared

#

So, that's my bad

woeful dune
#

so the answer is everything you had / sqrt?

broken plank
#

Yes, but we also wrote the wrong equations

#

Though you can keep my answer for r, it's correct

woeful dune
#

oh fuck

broken plank
#

Don't worry

#

it's a quick fix

#

$$\frac{\partial g}{\partial r} (r, \sigma) = -\sum_{i=1}^n \frac{2x_i(v_i-rx_i)}{\sigma^2} = 0$$

$$\frac{\partial g}{\partial \sigma} (r, \sigma) = \sum_{i=1}^n \frac{-2(v_i-rx_i)^2}{\sigma^3} - \frac{-n}{\sigma} = 0$$

woeful dune
#

oh my first solution was correct!

woeful dune
#

massive win for me, this is my first win in life

broken plank
#

Yeah, my bad, I messed up because I thought it was the estimation of sigma²

woeful dune
#

please never apologize

#

again

#

you're my savior, my new christ

broken plank
#

Though it still doesn't solve the mystery of the hanging 2

#

when differentiating 1/sigma²

woeful dune
#

no, now it does

#

because in the other one, we had n/2sigma

#

right?

broken plank
#

because the 2 just appeared in the sum instead

woeful dune
#

oh shit

#

oh you're right

broken plank
#

since differentiating 1/sigma² wrt sigma

#

gives -2/sigma^3

woeful dune
#

yeah yeah

#

maybe the 2 is just supposed to be there in this case, idk?

broken plank
#

Hmmm

broken plank
#

Ah no I know why

#

It's because I'm a moron

woeful dune
broken plank
#

It's supposed to be, divided by 2 sigma²

#

so the 2 cancels out

#

again, it doesn't affect our reasoning with the finding of r

woeful dune
#

why divided by 2sigma^2

#

or is that just the log-likelihood

broken plank
#

For that, I'll need a bit of backtracking

#

You know that an observation $V_i$ follows, conditionally to $X_i$, a normal distribution

broken plank
#

Except all $X_i$ are iid in our hypothesis

woeful dune
#

yeah

broken plank
#

So the distribution of $V = (V_1, ..., V_n)$ is

broken plank
#

a product

#

since they're all independent

woeful dune
#

okay

#

question

#

when given to find sigma^2, shouldn't the MLE be the biased sample variance

#

just asking, thats what i read online

broken plank
#

indeed

#

Well, the thing is

#

Since you have a product

woeful dune
#

okay, so when asked to find sigma^2 in the Gaussian case, always the biased sample variance

#

what will be the result in our case then

broken plank
#

The joint density of the $\epsilon_i$ will be:

woeful dune
#

will it just be the square root of the biased sample variance?

broken plank
#

$f_{\epsilon}(v - r x) = \prod_{i=1}^{n} f_{\epsilon_{i}}(v_i - rx_i)$

#

Where $f_{\epsilon_i}$ is the pdf (density) of a normal distribution of mean $0$ and variance $\sigma^2$

broken plank
#

So that explains why when you take the log

broken plank
#

$g(r, \sigma) = \sum_{i=1}^{n} \ln f_{\epsilon_i}(v_i - rx_i)$

woeful dune
#

okay

#

i am with you so far

broken plank
#

$g(r, \sigma) = \sum_{i=1}^{n} \ln f_{\epsilon_i}(v_i - rx_i) = \sum_{i=1}^{n} \ln \left( \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( \frac{(v_i - r x_i)^2}{2\sigma^2}\right)\right)$

#

And by using the properties of the log

#

$g(r, \sigma) = \sum_{i=1}^{n} \left( - \frac{1}{2} \ln(2 \pi) - \ln \sigma + \frac{(v_i - rx_i)^2}{2\sigma^2}\right)$

woeful dune
#

okay so thats where we were last

#

then we derived

broken plank
#

So here, if you take out the terms and group them

#

you end up with

#

$g(r, \sigma) = - \frac{n}{2} \ln(2\pi) - n \ln \sigma + \sum_{i=1}^{n} \frac{(v_i - r x_i)^2}{2\sigma^2}$

broken plank
#

So this is the function to optimize

#

called the log-likelihood

#

since it's literally the log of the likelihood

woeful dune
#

gotcha, so just to finish (I've been following so far)

#

we got r

broken plank
#

Yep

#

We know how to find the r that maximizes the likelihood

woeful dune
#

now to get sigma, is it just the biased sample variance sqrt?

#

or is it actually different

broken plank
#

Using similar computations, yes

#

you find that it is the biased sample variance

woeful dune
#

thats for sigma^2

#

not for sigma tho

#

but sigma should be just that sqrt ig

broken plank
#

I think it's the same thing, let me check

#

$\frac{\partial g}{\partial \sigma} (r, \sigma) = 0 = -\frac{n}{\sigma} + \sum_{i=1}^{n} \frac{-2 (v_i - rx_i)^2}{2 \sigma^3}$

broken plank
#

So grouping stuff together, we find that:

#

Argh I made a mistake again

#

I'm so mad

woeful dune
#

thats okay dw

#

rion can i ask you something else real quick

#

its still about this exercise

broken plank
#

The $f_{\epsilon_i}(t) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{t^2}{2 \sigma^2}\right)$ with a minus in the exponential

broken plank
#

so yeah you should find what you want

#

and sure do ask

woeful dune
#

when it says find the confidence interval, how the shit would I approach that?

broken plank
#

$g(r, \sigma) = - \frac{n}{2} \ln(2\pi) - n \ln \sigma - \sum_{i=1}^{n} \frac{(v_i - r x_i)^2}{2\sigma^2}$

woeful dune
#

yes

#

for r

broken plank
#

$\hat{r} = \left( \sum_{i=1}^{n} x_i^2 \right)^{-1} \sum_{i=1}^{n} x_i v_i$

broken plank
#

$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (v_i - \hat{r} x_i)^2$
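
A minimal Python sketch of these two estimators, assuming the model $v_i = r x_i + \epsilon_i$ with the $x_i$ treated as fixed numbers. The function names and the synthetic data (true slope 2.0, true sigma 0.5) are made up for illustration. The last lines only check numerically that nudging the estimates does not increase the log-likelihood.

```python
import math
import random


def mle_through_origin(x, v):
    """Closed-form MLEs for v_i = r * x_i + eps_i, eps_i ~ N(0, sigma^2)."""
    sxx = sum(xi * xi for xi in x)
    sxv = sum(xi * vi for xi, vi in zip(x, v))
    r_hat = sxv / sxx
    # Biased (divide by n) average of squared residuals.
    sigma2_hat = sum((vi - r_hat * xi) ** 2 for xi, vi in zip(x, v)) / len(x)
    return r_hat, sigma2_hat


def log_likelihood(r, sigma, x, v):
    """g(r, sigma) = -n/2 ln(2 pi) - n ln(sigma) - sum (v_i - r x_i)^2 / (2 sigma^2)."""
    n = len(x)
    rss = sum((vi - r * xi) ** 2 for xi, vi in zip(x, v))
    return -0.5 * n * math.log(2 * math.pi) - n * math.log(sigma) - rss / (2 * sigma**2)


# Synthetic data: hypothetical true slope 2.0 and true sigma 0.5.
rng = random.Random(0)
x = [rng.uniform(1.0, 5.0) for _ in range(200)]
v = [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]

r_hat, sigma2_hat = mle_through_origin(x, v)
sigma_hat = math.sqrt(sigma2_hat)
best = log_likelihood(r_hat, sigma_hat, x, v)

print("r_hat =", r_hat, "sigma_hat =", sigma_hat)
# Moving away from the estimates should not improve the log-likelihood.
print(best >= log_likelihood(r_hat + 0.01, sigma_hat, x, v))
print(best >= log_likelihood(r_hat, sigma_hat * 1.1, x, v))
```

The estimate of sigma itself is the square root of `sigma2_hat`, which is why the sketch takes `math.sqrt` before evaluating the log-likelihood.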

#

To think about confidence intervals

#

You should think of whether $\hat{r}$ is far from $r$

broken plank
#

Considering that your $x_i$ and $v_i$ are random variables

broken plank
#

then $\hat{r}$ and $\hat{\sigma}$ are also random variables

broken plank
#

So 1. what distribution do they follow?

#
  2. Can you deduce a confidence interval?
woeful dune
#

normal distribution

#

we would need to use the student table with t

broken plank
#

I'm not exactly sure if $\hat{r}$ is a student rv

woeful dune
#

is it not?

#

since they follow normal dist, i thought they'd follow student

broken plank
#

For the $\hat{\sigma}^2$

broken plank
#

I can see it being the case

woeful dune
#

fischer, then?

broken plank
#

But then I don't have absolute guarantees

woeful dune
#

what the fuck then

#

i am so stupid istg

broken plank
#

Let me think for a sec

woeful dune
#

Give a confidence interval for r of significance level α, with σ² = 1.

#

this is the question

#

thats sigma^2

broken plank
#

For $\hat{r}$, conditionally to the $x_i$'s, maybe you can get away with some things

broken plank
#

So if you consider the x_i constant (and not random)

woeful dune
#

mhmm

broken plank
#

You can consider the x_i constant

#

but the v_i are normal

#

N(r x_i, sigma²)

woeful dune
#

okay

broken plank
#

so divided by the sum of squares

#

$x_i v_i \sim \mathcal{N} \left( r x_i^2, x_i^2 \sigma^2 \right)$

woeful dune
#

ok

broken plank
#

And they are all independent

woeful dune
#

mhmm

broken plank
#

so the sum is still a normal distribution

woeful dune
#

so it follows student then?

broken plank
#

No, as said it is a normal distribution

#

$\hat{r} \sim \mathcal{N}(..., ...)$

broken plank
#

we just gotta fill the blanks

woeful dune
#

thats

#

z value

#

right

broken plank
#

Well, only if the parameters are 0 and 1

#

but they're probably not

#

Let's check

#

anyhow, for a sum of independent normals, it follows a normal

#

which mean is the sum of means

#

and the variance is the sum of variances

#

$\sum_{i=1}^{n} x_i v_i \sim \mathcal{N}(r \sum_{i=1}^{n} x_i^2, \sigma^2 \sum_{i=1}^{n} x_i^2)$

broken plank
#

Now, if you divide that by the sum of xi²

woeful dune
#

1 and sigma^2

#

r and sigma^2

broken plank
#

$\hat{r} \sim \mathcal{N}\left(r, \sigma^2 \frac{1}{\sum_{i=1}^n x_i^2} \right)$

broken plank
#

So now you know the distribution of $\hat{r}$

broken plank
#

you can do your appropriate tests
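
A minimal sketch of the interval this leads to, assuming the $x_i$ are treated as constants and sigma is known (the exam question fixes sigma^2 = 1). Since $\hat{r}$ is normal with mean $r$ and variance $\sigma^2 / \sum x_i^2$, a two-sided interval at confidence level $1 - \alpha$ is $\hat{r} \pm z_{1-\alpha/2} \, \sigma / \sqrt{\sum x_i^2}$. The function name and the three data points are hypothetical.

```python
import math
from statistics import NormalDist


def slope_confidence_interval(x, v, sigma=1.0, alpha=0.05):
    """Two-sided (1 - alpha) interval for r when sigma is known."""
    sxx = sum(xi * xi for xi in x)
    r_hat = sum(xi * vi for xi, vi in zip(x, v)) / sxx
    # Standard normal quantile z_{1 - alpha/2} (about 1.96 for alpha = 0.05).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(sxx)
    return r_hat - half_width, r_hat + half_width


# Made-up observations, only to show the call.
low, high = slope_confidence_interval([1.0, 2.0, 3.0], [2.1, 3.9, 6.2], sigma=1.0, alpha=0.05)
print(low, high)
```

Because sigma is given, the quantile comes from the normal table. A Student table would only come into play if sigma had to be estimated from the same data.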

woeful dune
#

makes sense

#

thank you so much!

#

i already nominated you once, idk if i can twice

#

you're amazing

broken plank
#

Which is at least ten times more difficult

#

because I think we need a theorem or something

woeful dune
#

which other one

#

the third one?

broken plank
#

No, the distribution of $\hat{\sigma}^2$

broken plank
#

This one is pretty hardcore to prove and to be completely honest I don't have the toolbox to prove it

#

or at least it needs quite elaborate thinking

woeful dune
#

honestly, between you and I, I don't think I have the toolbox to understand it even if you did

#

I am genuinely thankful for your help so far, this is way more than i knew before so

#

exam is tomorrow anyway, this is a joke

#

4 days to finish a data science course, my uni went crazy

broken plank
#

I don't have a good proof

#

But $\frac{n \hat{\sigma}^2}{\sigma^2}$ follows a $\chi^2_{n-1}$

broken plank
#

@woeful dune Check the book they mention for the proof

woeful dune
#

will do, thank you so much

broken plank
#

Wait I'm a moron

#

they never asked for the confidence interval of $\hat{\sigma}$

broken plank
woeful dune
#

i wont lie, i didnt notice either

#

so really, i am the bigger moron

broken plank
#

Well good thing they don't

#

I can't prove it

woeful dune
#

lmao

#

great

broken plank
#

Anyway, a list of takeaways for you for your exam tomorrow

#

knowing how to compute a MLE is kind of really important

woeful dune
#

mhmm

broken plank
#

So first takeaway

#

the likelihood is just a product of likelihoods when you have iid samples

#

which justifies taking the log likelihood, since you go from product to sum

woeful dune
#

ok

broken plank
#

second takeaway:
the parameters that maximize the log-likelihood also maximize the likelihood

#

so that also justifies using the log-likelihood instead of the likelihood

#

third takeaway:
you need multivariate differential calculus to compute optima

#

just like how you take the derivative of a function to check where it's either max or min
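
A small Python sketch of the first two takeaways, with one extra practical point that is not from the thread: on a computer, the raw product of many densities underflows to zero, while the sum of logs stays usable. The sample size and the standard normal model are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
density = NormalDist(0.0, 1.0).pdf

# Likelihood: product of the individual densities (iid samples).
likelihood = 1.0
for s in samples:
    likelihood *= density(s)

# Log-likelihood: the product becomes a sum of logs.
log_likelihood = sum(math.log(density(s)) for s in samples)

print(likelihood)      # each factor is below 0.4, so the product underflows to 0.0
print(log_likelihood)  # a finite negative number
```

Since the log is increasing, whatever parameters maximize one also maximize the other, so nothing is lost by working with the sum.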

woeful dune
#

yeah, makes sense

#

and i guess pray to god

broken plank
#

don't

#

i speak from experience, that doesn't work

#

i can't explain my losing streak in gacha games with statistics because i'm already outside the interval of confidence

woeful dune
#

lmao

#

i will pray regardless

#

tbh i am pretty terrified

broken plank
#

Why

woeful dune
#

this is what they had last year

#

I've solved everything but 1. b part 2. 4

broken plank
#

1b?

#

the monte carlo thing?

woeful dune
#

yes

#

i didnt get that

broken plank
#

I think they just ask you to sample a bunch of x_i, and use the monte carlo estimator of the mean

woeful dune
#

i dont know tbh

broken plank
#

well you know what a monte carlo estimator is right

woeful dune
#

yes

#

i mean i know the algorithm

broken plank
#

Yeah

woeful dune
#

this case its a discrete one, so i was thinking of just answering that with the theoretical thing

#

like just check h(x) + h(x) + h(x) all over n, until it becomes bigger than the interval, but i dont think thats correct

broken plank
#

Well it is possible to compute E[h(X)] exactly using the transfer theorem

#

but they want an approximation, not an exact value

#

How did you define monte carlo methods in your lecture

woeful dune
#

uh

#

do you want the file

#

its a powerpoint

broken plank
#

just send me pictures of it here

#

parts you think are relevant

woeful dune
woeful dune
# woeful dune

we did answered that algorithm question in this one , massive rip

#

and then for the continuous rv, we have explanations with the inverse and rejection methods

#

and thats it

broken plank
woeful dune
#

we didn't*

#

he just skipped over it lol

broken plank
#

Ah ok

#

Well I mean

#

You can still do something

#

First of all, you can estimate $\mathbb{E}[Y]$ with an empirical mean of iid samples that follow the distribution of $Y$

broken plank
#

So $\frac{1}{n} \sum_{i=1}^{n} Y_i$

broken plank
#

So now I guess what you need is to find how to generate $Y_i$

broken plank
#

This is not all that hard

#

You generate $X_i$ with said distribution

broken plank
#

And you just compute $Y_i = h(X_i)$

broken plank
#

And here, to generate $X_i$, you use the method they show in the slides

broken plank
#

@woeful dune

#

So you cut the interval [0, 1] into bins

#

of size p1, p2, ..., pk

woeful dune
#

so basically, just choose the other stuff, so instead of .32, maybe .5

broken plank
#

(which should sum up to 1)

woeful dune
#

and compute different stuff?

broken plank
#

you generate a uniform rv

#

and you see in which bin it falls into

#

and you take the corresponding value
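
The scheme described above could be sketched in Python as follows. The support, the probabilities and the function `h` are all hypothetical placeholders, since the exam's actual distribution is not reproduced here. Only the structure matters: cut [0, 1] into bins, draw a uniform, look up the bin, apply `h`, and average.

```python
import bisect
import itertools
import random

values = [-2, -1, 0, 2, 3]           # hypothetical support of X
probs = [0.1, 0.2, 0.4, 0.2, 0.1]    # hypothetical probabilities, summing to 1
# Right edges of the bins that partition [0, 1].
cum = list(itertools.accumulate(probs))


def h(x):
    """Hypothetical function whose expectation E[h(X)] is wanted."""
    return x * x


def pick_value(u):
    """Return the value whose bin contains the uniform draw u in [0, 1)."""
    idx = bisect.bisect_right(cum, u)
    # Guard against floating-point rounding in the last bin edge.
    return values[min(idx, len(values) - 1)]


def monte_carlo_mean(n, rng):
    """Average of h(X_i) over n simulated draws of X."""
    return sum(h(pick_value(rng.random())) for _ in range(n)) / n


rng = random.Random(0)
estimate = monte_carlo_mean(100_000, rng)
exact = sum(h(x) * p for x, p in zip(values, probs))
print(estimate, exact)
```

On a written exam the loop would be described in words. The code is only a way to see that the estimate lands near the exact value.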

woeful dune
#

so this exam is written

#

since i can't exactly generate

#

should i just take random values

#

over and over

broken plank
#

No, here they ask you to create a scheme

#

so a method

#

you don't use the method but you explain that you can

woeful dune
#

so basically write out an explanation?

broken plank
#

That's what a scheme is isn't it?

#

Just a theoretical framework

#

Though then I don't quite understand the error part

#

the precision I mean

#

But I assume that it has to do with the number of samples you create, which is n

woeful dune
#

something like

rand U
X ~ U(w)
Find h(x)
Keep doing this until E[Summation H(x)] > e

broken plank
#

The variance of $\bar{Y}_n = \frac{1}{n}\sum_{i=1}^{n} Y_i$ is $\frac{V(Y_1)}{n}$

broken plank
#

Ah yeah that's fine

#

Using the Chebyshev inequality

woeful dune
#

so basically I should say

#

as long as theta < Var [h(x)]/ ne^2, keep generating

#

got it

broken plank
#

Well, essentially, the idea is

#

you need $n$ to be large enough

broken plank
#

But the question asks for how large
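
One hedged way to turn "how large" into a number uses the Chebyshev inequality, $P(|\bar{Y}_n - E[Y]| \geq \epsilon) \leq \frac{Var(Y)}{n \epsilon^2}$. Requiring the right-hand side to be at most some tolerance gives $n \geq \frac{Var(Y)}{\epsilon^2 \delta}$. The name `delta` for the allowed failure probability and the numbers in the call are assumptions for illustration.

```python
import math


def chebyshev_sample_size(variance, eps, delta):
    """Smallest n such that variance / (n * eps^2) <= delta."""
    return math.ceil(variance / (eps**2 * delta))


# Made-up inputs: Var(h(X)) = 2.0, precision eps = 0.5, failure probability delta = 0.25.
print(chebyshev_sample_size(variance=2.0, eps=0.5, delta=0.25))  # 32
```

The bound is conservative, so the true number of samples needed is usually smaller, but it is enough to justify a sample size on paper.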

woeful dune
#

yeah, thats question c

broken plank
#

Oh ok

#

I didn't see

#

yeah my bad

woeful dune
#

yeah question c is easier since he basically tells us to use a C

broken plank
#

Just keep in mind

#

I think you are expected to compute Var(h(X1))

#

or rather Var(h(X))

#

but you can do that by hand

woeful dune
#

how would i even compute var(h(x)) in that case

broken plank
#

It's just E[(h(X) - E[h(X)])²], and given that X is discrete with just 5 or 6 values

#

it's not a long computation

woeful dune
#

-2, -1, 0, 2, 3

#

that would be what, E[h(x) - E[h(x)^2]

#

let me ask you the dumbest question yet

#

how would I find the E of it for only one value

broken plank
#

wdym

#

for a discrete rv X, the expectation of X is just the sum of x_i p_i

#

where p_i is P(X = x_i)

#

and using the transfer theorem

#

the expectation of h(X) is the sum of h(x_i) p_i

#

$E[h(X)] = \sum_{i=1}^{K} h(x_i) P(X = x_i)$

broken plank
#

and here you have like 5 values

#

it's just a sum of 5 terms

#

likewise $E[h(X)^2] = \sum_{i=1}^{K} h(x_i)^2 P(X = x_i)$

woeful dune
#

then what is X - E[h(X)^2]

broken plank
#

The variance of a rv Y is the expectation of (Y - E[Y])²

#

so E[(Y - E[Y])²]

#

keep in mind that E[Y] is a number here

woeful dune
#

in that, case, what do I substitute for Y

#

like

#

one of the values?

broken plank
#

There is a formula that says that V(Y) = E[Y²] - E[Y]²

#

so you can compute those if it's easier for you

broken plank
woeful dune
#

and then the mean of that, will be found how?

#

so its number - number

#

its E[number] which means what exactly

broken plank
#

Y is a random variable

#

E[Y] is a number (or, I guess, in perfectly accurate terms, a constant random variable that is always equal to the same number)

woeful dune
#

so in my case i would do var (h(x)) = E[h(x^2)) - E[h(x)]^2

broken plank
#

No

woeful dune
#

oh

broken plank
#

E[h(X)²] - E[h(X)]²

#

it's h(X) that is squared in the first one

#

not h(X²)
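
A short Python sketch of that by-hand computation, with a hypothetical distribution and a hypothetical `h` (the exam's own are not shown here). It also evaluates $E[h(X^2)]$ to show that it is a different number from $E[h(X)^2]$.

```python
values = [-2, -1, 0, 2, 3]           # hypothetical support of X
probs = [0.1, 0.2, 0.4, 0.2, 0.1]    # hypothetical probabilities


def h(x):
    """Hypothetical h, chosen so that h(x)^2 and h(x^2) differ."""
    return x + 1


# Transfer theorem: E[f(X)] = sum of f(x_i) * P(X = x_i).
mean_h = sum(h(x) * p for x, p in zip(values, probs))
mean_h_sq = sum(h(x) ** 2 * p for x, p in zip(values, probs))
variance = mean_h_sq - mean_h**2

# Same variance from the definition E[(Y - E[Y])^2].
variance_def = sum((h(x) - mean_h) ** 2 * p for x, p in zip(values, probs))

# The wrong quantity: h applied to X^2 instead of squaring h(X).
wrong = sum(h(x**2) * p for x, p in zip(values, probs))

print(mean_h, mean_h_sq, variance, variance_def, wrong)
```

With these made-up numbers, $E[h(X)] = 1.3$, $E[h(X)^2] = 3.9$ and the variance is $2.21$, while $E[h(X^2)] = 3.3$.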

woeful dune
#

gotcha

#

thats way easier

#

in the least romantic terms, i could kiss you right now

broken plank
#

I will pass on that but I appreciate the gratitude

woeful dune
#

can I give you a nitro as thanks or is that against the rules

broken plank
#

I don't know, I mean I don't think anyone will complain if you message me privately

#

I complain when people ask me for help in private, because I don't like it and the rules explicitly say not to do so

#

I didn't help you to receive anything in return though

woeful dune
#

meh, after days of asking for help in math discords, finally someone who could actually help me (and somehow had fucking infinite patience)

#

plus, i wouldn't be surprised if i asked another question in the future, so whatever

#

gratitude is good with words, but i like sending gifts too

#

shit, my card isn't working

#

i will DM you it later, thank you so much again @broken plank

broken plank
#

I'm helping someone else with another q

woeful dune
#

@broken plank shit, whenever you see this, we never actually got the estimators to the end and i am lost on how to do it

#

if you see this and write it out, thank you

woeful dune
#

But never actually got what sigma is

broken plank
#

We did

#

We got both estimators

#

The MLE estimator of sigma is the biased empirical variance of the samples

woeful dune
#

Isn't-that sigma^2

#

This is only sigma

broken plank
#

Ah yeah mb

#

Well the square root of that
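
For reference, a sketch of how the corrected log-likelihood from earlier in the thread leads to that answer, using the same symbols and assuming $\sigma > 0$:

$$g(r, \sigma) = - \frac{n}{2} \ln(2\pi) - n \ln \sigma - \sum_{i=1}^{n} \frac{(v_i - r x_i)^2}{2\sigma^2}$$

$$\frac{\partial g}{\partial \sigma} (r, \sigma) = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{n} (v_i - r x_i)^2 = 0 \Longleftrightarrow \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (v_i - r x_i)^2$$

$$\hat{\sigma} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (v_i - \hat{r} x_i)^2}$$

The last line substitutes $\hat{r}$ for $r$, since the first equation of the system already fixes the slope.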

woeful dune
#

Okay thank you

#

30 hours of no sleep and I couldnt find where we did that