#Help !! Which statistical tool do I use? [for research, beginner]
You could try a simple linear regression first maybe
For starters
Ohhh that's true :00
You have data points X and you want to predict Y, in the form <w, X> + b
what if the points are too scattered tho? that means i cant rely on the regression model to write an equation for prediction ryt??
Where w is a vector and < . , . > is the dot product
Well linear regression is one model
The goal is to find the plane that best fits your data
If you feel that a linear model is not appropriate, there must be something else that can be used
I'm just saying linear regression because it is fairly simple and beginner friendly
Otherwise in general you look for a feature map f so that you compute predictions Y ~ <w, f(X)> + b
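To make the two ideas above concrete, here is a minimal sketch in plain Python. The data is made up for illustration, and the helper names `fit_line` and `r_squared` are hypothetical, not from any library. It fits $\langle w, X \rangle + b$ with one predictor, then fits the same linear model after a feature map $f(x) = x^2$, and uses $R^2$ as a rough check of how scattered the points are around each fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (w, b) for y ~ w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def r_squared(xs, ys, w, b):
    """Share of the variance in ys explained by the fitted line (1.0 = perfect fit)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data with a curved relationship: y = 2*x^2 + 1 (an assumption for the demo)
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x ** 2 + 1.0 for x in xs]

# Plain linear regression on x itself
w_raw, b_raw = fit_line(xs, ys)
r2_raw = r_squared(xs, ys, w_raw, b_raw)

# Same linear regression, but on the mapped feature f(x) = x^2
fs = [x ** 2 for x in xs]
w_map, b_map = fit_line(fs, ys)
r2_map = r_squared(fs, ys, w_map, b_map)

print("raw x:     w =", w_raw, "b =", b_raw, "R^2 =", r2_raw)
print("f(x)=x^2:  w =", w_map, "b =", b_map, "R^2 =", r2_map)
```

On this toy data the straight line explains none of the variance, while the fit on $f(x)$ recovers the equation $Y = 2X^2 + 1$. A low $R^2$ on real data is a hint that the plain linear equation is not reliable for prediction, which is when a feature map or another model is worth a look.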
is there any other statistical tool that can be used ? i think it's okay for us to have slightly more complex ones since we're not actually gonna perform them
we just had to state which tool we're gonna use
You could look into machine learning
Two decades ago i think they used SVMs and whatnot
Decision trees too
It may sound a bit daunting but machine learning offers many other methods of regression
That are mostly more flexible than linear regression
That being said, usually the simpler the model the better
ohhh thats interestinggg
truee : P
i see i see
i rlly dont know stuff so im sorry xDD but likeee
can an equation still be formed from this and the like ?
or nahh ?
our research topic rlly got us
making equations and allat
lol
like it's literally in our title :((
inevitably we have to write one
For a svm?
Understand the mathematical formulation of linear and nonlinear SVM regression problems and solver algorithms.
This is the complicated version
i'm def gonna consider ittt
in what ways is it better than just plain linear ?
i trust u thooo that it's more flexible n stuff : D
i just gootta know how to justify it to my group and my adviser too
Yeah, I can explain
So essentially, the idea is as follows
You have two random variables $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}$, you want to find out how they are related
Rion
So what you usually do is an estimation: $\hat{Y} = f(X)$, and you want your estimation not to be too far from $Y$
So the entire question is, how do you build $f$ ?
Here, what we call a model, you can think of it as a set of assumptions over the distribution of $X$ and the distribution of $Y$
For instance, the linear regression model assumes that $Y = \langle w, X \rangle + b + \epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma^2)$
In other words, a linear relationship with X, plus random noise
So you obviously want to find the w and b that fit best
So, here, your estimation of $Y$ is $\hat{Y} = \langle w, X \rangle + b$
And this is a fairly simple model
because it's just linear
$\hat{Y} = f_{w,b}(X) = \langle w, X \rangle + b$
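A worked step that may help when justifying "best fit": under the Gaussian noise assumption on $\epsilon$ from the linear model, and assuming independent observed samples $(x_i, y_i)$ for $i = 1, \dots, n$ (notation introduced here for illustration), the log-likelihood of the data is

$$\log L(w, b, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left( y_i - \langle w, x_i \rangle - b \right)^2$$

Only the last term depends on $w$ and $b$, so maximizing the likelihood is the same as minimizing the sum of squared errors:

$$(\hat{w}, \hat{b}) = \arg\min_{w, b} \sum_{i=1}^{n} \left( y_i - \langle w, x_i \rangle - b \right)^2$$

That is the usual least-squares criterion, and plugging $\hat{w}, \hat{b}$ into $\langle w, X \rangle + b$ gives the prediction equation.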
But in general, there are models that come up with more complicated estimations of Y
In which case, without loss of generality, you can denote them:
$\hat{Y} = f_{\theta}(X)$
Where $\theta$ is the parameter of your model, which you want to fit
yahhh 
Ok, that's very good
Because if you understand up to now
you basically understood machine learning
We make a model for the data, and we optimize the parameter $\theta$ to best fit the data
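A minimal sketch of what "optimize $\theta$" can look like in practice, using plain gradient descent on the mean squared error for $\theta = (w, b)$ with one predictor. The data, the learning rate, and the step count are assumptions picked for this toy example, and the function names are hypothetical.

```python
def predict(theta, x):
    """Model f_theta(x) = w*x + b with theta = (w, b)."""
    w, b = theta
    return w * x + b


def mse_gradient(theta, xs, ys):
    """Gradient of the mean squared error with respect to (w, b)."""
    n = len(xs)
    grad_w = sum(2.0 * (predict(theta, x) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (predict(theta, x) - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def fit(xs, ys, learning_rate=0.05, steps=2000):
    """Start at theta = (0, 0) and repeatedly step against the gradient."""
    theta = (0.0, 0.0)
    for _ in range(steps):
        grad_w, grad_b = mse_gradient(theta, xs, ys)
        theta = (theta[0] - learning_rate * grad_w,
                 theta[1] - learning_rate * grad_b)
    return theta


# Made-up, noise-free data generated from y = 3x + 2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 2.0 for x in xs]

w_hat, b_hat = fit(xs, ys)
print("fitted w =", w_hat, "fitted b =", b_hat)
```

For linear regression there is also a closed-form solution, so gradient descent is not required there. The point of the sketch is that the same loop works for more complicated $f_\theta$ as long as you can compute the gradient.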
And here, support vector machine is one such model
with its own specific set of parameters $\theta$
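For reference, a sketch of what the SVM looks like as an equation in the regression case. This is the standard textbook form of linear $\varepsilon$-insensitive support vector regression; $C$, $\varepsilon$ and the samples $(x_i, y_i)$ are notation assumed here, and this $\varepsilon$ is a tolerance width chosen by the user, not the noise term $\epsilon$ of the linear regression model.

$$\min_{w, b} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \max\left(0, \; \left| y_i - \langle w, x_i \rangle - b \right| - \varepsilon \right)$$

In the linear case the parameter is still $\theta = (w, b)$ and the prediction is $\hat{Y} = \langle w, X \rangle + b$, so an explicit equation can be written down just like with linear regression. With a kernel $k$ the prediction takes the form

$$\hat{Y} = \sum_{i=1}^{n} \beta_i \, k(x_i, X) + b$$

which is still an equation, but with one term per support vector (the samples with $\beta_i \neq 0$), so it is usually much longer to write out.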
And there are many other models, with their own sets of assumptions and their $f_\theta$
It just turns out that in linear regression, $\theta = (w, b)$ and that the estimation $f_\theta$ is fairly simple
That's why, if linear regression does not suffice, perhaps you need a better model
that isn't so simple, but may accurately model the variety of your data
yeahhh : D
And a couple of decades ago, before neural networks were popular, people liked SVMs a lot
But since there are more modern techniques for regression nowadays, SVM is mostly seen as a good math textbook exercise
by no means is it bad, just to be clear
definitely an affordable method to see if you just want to look around
but of course there are other things than SVM too
gaussian mixture
linear discriminant analysis (though mainly for classification)
ooooo~ i'll look into themm i can understand if i rlly tried xDD
nice knowing regressions r rlly the ones for thiss
i thought mayb i was geeking T__T
If my 10-minute crash course got through to you, then you basically understood what regression using ML is like
I do advise you to talk with your teacher or professor about this too
Maybe they will have better insights to provide
I'm very biased because I do research in AI so there's that
that's fiyahhh
will dooo
thanksss brahh
You're welcome
And which statistical tool is most appropriate? Thankss!! :3
