#Some questions about Bayesian Stats


sharp inlet
#

E_theta[p(y|x, theta)] - what is this? What is it a function of?
I understand that p(y|x) is the “full” Bayesian Inference for our problem and is the distribution of the label y given a point.

sharp inlet
#

bump

open shale
#

E_theta[p(y|x, theta)] - what is this thing? Is it related to the LMS estimator in any way? To calculate this expectation wrt theta, we integrate the argument wrt theta, weighted by the updated distribution of theta. That makes sense. I'm unsure why this integral gives us a distribution instead of a number.
E_theta[p(y|x, theta)] is the expected value of p(y|x, theta) where you treat theta like a known constant instead of a variable.
The result is a distribution because E_theta[p(y|x, theta)] still depends on x

cosmic nest
#

I feel as though this might be a bit incorrect, one should have that:
$$p(y\vert x) = E_{\Theta}[p(y \vert x, \Theta)] = \int_{\theta} p(y \vert x, \theta) p_\Theta(\theta) d\theta$$

cosmic nest
#

Where $\Theta$ is the random variable corresponding to the a priori distribution on the parameters
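A minimal numerical sketch of that integral, under assumptions that are not from the thread: a 1-D model $y = wx + \epsilon$ with $\theta = w$, a standard normal prior on $w$, and a known noise variance. The function names are made up for illustration. The expectation over $\Theta$ is approximated by averaging $p(y \vert x, \theta)$ over samples of $\theta$, and the result is a function of $y$ (for the given $x$), which is why it is a distribution rather than a single number.

```python
import numpy as np

def gaussian_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def predictive_density_mc(y, x, n_samples=200_000, noise_var=1.0, seed=0):
    """Monte Carlo estimate of E_Theta[p(y | x, Theta)].

    Assumption: Theta = w ~ N(0, 1) and y | x, w ~ N(w * x, noise_var).
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n_samples)            # samples of theta
    return gaussian_pdf(y, w * x, noise_var).mean()  # average of p(y | x, theta)

def predictive_density_exact(y, x, noise_var=1.0):
    """Closed form for the same toy model: y | x ~ N(0, x^2 + noise_var)."""
    return gaussian_pdf(y, 0.0, x ** 2 + noise_var)

x, y = 2.0, 1.5
print(predictive_density_mc(y, x), predictive_density_exact(y, x))
```

Drawing the samples of `w` from a posterior instead of the prior would give the posterior predictive in the same way.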

sharp inlet
#

is it also correct to view this as averaging the likelihoods over the posterior?

sharp inlet
#

Here I'm unsure what's a random variable and what's a constant. To my understanding, x is a constant, since it's the previously unseen test point that we want to make a prediction on. y is a random variable for the possible label value (this also follows from linear regression being discriminative). Then \Theta is a random variable distributed according to the posterior. Is this correct?

sharp inlet
#

And what would each of these objects look like in ridge regression

cosmic nest
#

You have some parametric distributions such as, say, a binomial one

#

or a Bernoulli one

#

or even a normal distribution, that's also parametric

#

So for instance, N(m, s²) has parameters theta = (m, s²)

#

but the prior distribution on the parameters means that theta is random too

cosmic nest
#

So essentially, see that a data point $x$ follows a distribution that is parametrized by $\theta$, say $p_{X \vert \theta}$, which is the likelihood. However, $\theta$ is RANDOM, and it follows a distribution $p_{\Theta}$, which is called the prior distribution. So you have a reasoning in two steps here:

  • Firstly, guess $\theta$ based on supporting evidence (your dataset $D$)

  • Secondly, use that to estimate the distribution of $x$, and obtain a predictive distribution
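Written out in the same notation, the two steps look roughly like this (a sketch; $p_{\Theta \vert D}$ and $x_{new}$ are labels introduced here for the posterior and for a new point):
$$p_{\Theta \vert D}(\theta \vert D) = \frac{p_{X \vert \theta}(D \vert \theta) \, p_\Theta(\theta)}{\int p_{X \vert \theta}(D \vert \theta') \, p_\Theta(\theta') \, d\theta'}$$
$$p(x_{new} \vert D) = \int p_{X \vert \theta}(x_{new} \vert \theta) \, p_{\Theta \vert D}(\theta \vert D) \, d\theta$$
The first line is Bayes' rule for the parameters given the dataset, and the second averages the likelihood of the new point over that updated distribution.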

cosmic nest
#

Example: linear regression. The data points $(x, y)$ follow a certain distribution, given by:
$$y = \langle w, x \rangle + \epsilon$$

Where $\epsilon \sim \mathcal{N}(0, \sigma^2)$. So here, you have two parameters to guess: $\theta = (w, \sigma^2)$.

sharp inlet
#

👀

cosmic nest
#

In the typical linear regression problem without regularization, what you effectively do is to maximize the likelihood, as in:
$$\hat{\theta}_{MLE} = \arg\max_{\theta} p_{X \vert \theta}(D \vert \theta)$$
However, what the MAP does is different, i.e.
$$\hat{\theta}_{MAP} = \arg\max_{\theta} p_{X \vert \theta}(D \vert \theta)p_\Theta(\theta)$$

cosmic nest
#

Now, being also super lazy, we actually suppose we also know the distribution $p_{\Theta}(\theta) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} \theta^2 \right)$

cosmic nest
#

So now, with that being given, you maximize just as you did before
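For the ridge regression question, here is a rough sketch of the two maximizations, under assumptions that are not from the thread: the noise variance is treated as known, so only $w$ is estimated, and the prior on $w$ is an independent zero-mean Gaussian with variance `prior_var` per coordinate. Taking logs of likelihood times prior then gives least squares plus an L2 penalty with weight `noise_var / prior_var`, which is ridge regression. The function names are hypothetical.

```python
import numpy as np

def fit_mle(X, y):
    """MLE of w under Gaussian noise: ordinary least squares."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_map(X, y, noise_var=1.0, prior_var=1.0):
    """MAP of w with prior w ~ N(0, prior_var * I): the ridge solution.

    Assumption: noise_var is known rather than estimated.
    """
    lam = noise_var / prior_var  # ridge penalty implied by the prior
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data from y = <w, x> + eps, eps ~ N(0, 1)
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.standard_normal(20)

w_mle = fit_mle(X, y)
w_map = fit_map(X, y)  # standard normal prior, as in the message above
print(w_mle, w_map)
```

The MAP weights come out shrunk toward zero relative to the MLE weights, and a very wide prior (large `prior_var`) recovers the MLE.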

cosmic nest
#

like, for instance, in a regression problem, you find the slope, then you use the slope to make a predictor

#

it's not that much different

sharp inlet
#

yes. we find this slope by going over all the training data in D

cosmic nest
#

Yes, that is correct

#

That is what p(theta | D) is for

#

guessing which theta is more likely to control the distribution, assuming that we have a bunch of data points of that distribution already
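As a sketch of what p(theta | D) looks like in the linear regression example, assuming (beyond what the thread states) a known noise variance and a zero-mean Gaussian prior on $w$: the posterior over $w$ is then Gaussian in closed form, and averaging the likelihood of a new point over it gives a Gaussian predictive distribution for $y$. The function names are made up for illustration.

```python
import numpy as np

def posterior(X, y, noise_var=1.0, prior_var=1.0):
    """Mean and covariance of p(w | D) for y = <w, x> + eps.

    Assumptions: eps ~ N(0, noise_var) with noise_var known,
    and prior w ~ N(0, prior_var * I).
    """
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predictive(x_new, mean, cov, noise_var=1.0):
    """Mean and variance of p(y | x_new, D), averaging over p(w | D)."""
    pred_mean = x_new @ mean
    pred_var = x_new @ cov @ x_new + noise_var  # parameter uncertainty + noise
    return pred_mean, pred_var

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 2))
y = X @ np.array([0.7, -1.2]) + rng.standard_normal(30)

post_mean, post_cov = posterior(X, y)
pred_mean, pred_var = predictive(np.array([1.0, 2.0]), post_mean, post_cov)
print(post_mean, pred_mean, pred_var)
```

With these assumptions the posterior mean coincides with the ridge (MAP) solution, and the predictive variance is the noise variance plus a term that reflects how uncertain the slope still is after seeing D.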

sharp inlet
#

Ok, I see. That answers my question on what is and isn’t a random variable

#

ohh i understand now

#

tysm!

#

+close