#tensorflow proj
If you can send over those 3 questions from last night, I'll give you the best answers I can @hallow garden
Sure
I am just asking questions so that we can contextualize, as accurately as possible, your problem
And identify your needs
First of all: What is your dataset, i.e. what are the input features (each column), and what is the target?
Assuming that you have a system that takes this input and outputs something, you want the input to be accurately known
- The dataset is the solution of an ordinary differential equation solved in Lyapunov time, where the time span is 1000. Is that a good enough answer?
(Given whatever params)
Well I want to know exactly what is X and what is y, in machine learning terms
At a certain time step t, you have one row of inputs
X is the input correct? And y is the ground truth in this case
Yep so that ground truth is our prediction target but it’s given to us already
Well, yes, but I want to know what it physically corresponds to
You said earlier that it's your displacement x(t)
in physics terms
Yeah correct
But what are the input features of the system?
What do you need to predict x(t) from
You would need alpha beta gamma delta omega
Like, idk, temperature? overall system height?
The model should make it more clear lemme send it
Also, let's change notations
Like here, those different variables represent things such as damping, amplitude, etc.
Let's name the displacement y(t)
because y also corresponds to target notation in ML
So input data would be denoted x(t)
So your dataset should be sequences (x, y) such that you want to predict y from x, with both being sequences
Ok, so we can do this even when the thing we're predicting is the ground truth we already have?
Yes, but just to be clear, just because you have a ground truth doesn't mean it's input data, yeah?
to your system
Yea that’s true
It's used to train your model
so that your model learns to map x to y
but that doesn't mean x = y
otherwise you already have a trivial predictor
which does nothing to x
And my question remains: what is x comprised of?
I haven't got an answer to this yet
X is the input
yes but what is in that input
example: x(t) = [temperature at t, acceleration at t, speed at t, ...]
Such as?
Alpha beta gamma delta
Try to write it in the same format as me above
x(t) should be a finite 1D array with certain components
which is what I am asking you about rn
what are each of those components
Okay:
X and y are simply a split of the solution in Lyapunov time.
X is the first third and y is the last 2/3rds.
Oh ok hold up I just read that
Ok, so the sequence x would be: [displacement at t0, ..., displacement at t0 + n]
and y would be: [displacement at t0+n+1, ..., displacement at t0 + n + k]
Yeah
Ok, very good
Exactly what I wanted to know
so here x(t) is just a 1D array with 1 number
which is: displacement @ t
Good start
Now the second question: your task is to predict y from x, but what loss function do you use?
Like MSE?
Sounds pretty ok to me
So you have a system that outputs y_hat from x
and you compute ||y_hat - y||² / k
and that is your loss per sample
yes?
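To make that concrete, here's a minimal NumPy sketch of the per-sample loss, assuming y and y_hat are arrays holding the k target steps. The function name and the numbers are made up for illustration, they're not from your code:

```python
import numpy as np

def mse_per_sample(y_hat, y):
    """Squared L2 distance between prediction and target, divided by k steps."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    k = y.shape[0]  # number of predicted time steps
    return float(np.sum((y_hat - y) ** 2) / k)

# Toy values (assumed): only the last step is off, by 2
y = np.array([0.0, 1.0, 2.0, 3.0])
y_hat = np.array([0.0, 1.0, 2.0, 5.0])
loss = mse_per_sample(y_hat, y)  # (5 - 3)^2 / 4
```

In Keras, passing loss="mse" to compile should give you the same idea, averaged over the batch.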
Ok to follow along lets back track for sec
Up to this point is where we got (x train and y train) which is fine..right?
what is this "solution" array?
That’s the solution from the equation
what is its shape?
The solution?
The "solution" array
It’s 2D
what are its dimensions?
2 dimensional
... so it's in the shape (n, p) with certain n and p, right?
Yeah
Sorry you're not answering my question rn
Like what they represent?
[[0, 1],
 [0, 1],
 [0, 1]] is of shape (3, 2)
and is two-dimensional
I assume your array looks like that but with a different number of rows and columns
so how many is my question
and what do they represent
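In NumPy terms, what I'm after is just what `.shape` prints for your array. A tiny sketch with the toy array (not your data):

```python
import numpy as np

# Toy array with 3 rows and 2 columns
arr = np.array([[0, 1],
                [0, 1],
                [0, 1]])

n_dims = arr.ndim    # number of dimensions (axes)
shape = arr.shape    # (rows, columns)
print(n_dims, shape)
```

So "2D" answers `ndim`, while my question is about `shape`.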
What is the second column?
Or rather what are the two columns for
position coordinates in the plane?
Like, for example, with [[0, 1]], are you asking what 0 and 1 represent?
I assume the 1000 stands for the time period where you observe
and the 2 represents what?
Yes that’s correct
coordinates ij?
The dimensions because it’s 2D array
Well yes I know it's a 2D array, I just want to know why you have two columns and not an arbitrary amount of columns
It’s just an array of each solved equation at each time point
is it because the displacement is represented in two dimensions in the plane?
Yes
as in, x(t) = [i(t), j(t)]
Ok, that was my question
I have a different suggestion for you
Something that may be more helpful or less, who knows
Assume the sequence of displacements is s
so s(t) is the state of your system at t
So you have all s already
Meaning the time span?
Basically, i(t) and j(t) for all t
You simulate the system with your program
and obtain s(t)
for all t
which is the solution array that you have
of shape (1000, 2)
We will make a dataset from this
an X and a Y
ok
x:
[s(t0),
s(t0+1),
...
s(t0+p)]
and instead of predicting the next k
you will only predict y = s(t0+p+1)
so you have a chunk of time frame p
So like individual time steps?
and you predict the displacement at the next instant
yes
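Here's a rough sketch of how that windowing could look in NumPy. The function name `make_windows`, the value p = 9, and the sin/cos stand-in for the solution array are all assumptions for illustration; swap in your real (1000, 2) array:

```python
import numpy as np

def make_windows(s, p):
    """Build one-step-ahead pairs: X[i] = s[i .. i+p], Y[i] = s[i+p+1]."""
    s = np.asarray(s, dtype=float)
    n_samples = len(s) - (p + 1)  # windows that still have a next state
    X = np.stack([s[i : i + p + 1] for i in range(n_samples)])
    Y = s[p + 1 :]
    return X, Y

# Stand-in for the simulated solution array of shape (1000, 2)
t = np.linspace(0.0, 10.0, 1000)
s = np.stack([np.sin(t), np.cos(t)], axis=1)

X, Y = make_windows(s, p=9)
print(X.shape, Y.shape)
```

Each row of X is a window of p + 1 consecutive states, and the matching row of Y is the single state right after it.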
why is it going to be equivalent?
because now, if you want to predict the whole sequence
you predict s(t0+p+1)
I tried something like that but idk if it’s conceptually correct if I can show u
then you plug it into the sequence and make a new x
x_new = [s(t0+1), ..., s(t0+p), y]
and then you predict the next again
and you keep doing this
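A minimal sketch of that loop, assuming you have some function that maps a window to the next state. `predict_next` here is a hypothetical placeholder (plain linear extrapolation), in practice it would wrap your trained model:

```python
import numpy as np

def rollout(predict_next, x0, n_steps):
    """Predict n_steps states by feeding each prediction back into the window."""
    window = np.array(x0, dtype=float)
    preds = []
    for _ in range(n_steps):
        y = np.asarray(predict_next(window), dtype=float)
        preds.append(y)
        # Drop the oldest state, append the new prediction
        window = np.concatenate([window[1:], y[None, :]], axis=0)
    return np.stack(preds)

def dummy_predict_next(window):
    # Placeholder model (assumption): extrapolate from the last two states
    return 2 * window[-1] - window[-2]

x0 = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
preds = rollout(dummy_predict_next, x0, n_steps=3)
print(preds)
```

One thing to keep in mind: errors can build up over a long rollout, since each prediction becomes part of the next input.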
I'm just thinking that 1 target prediction may be better for your neural network
rather than an entire time window
I mean that's pretty sound if you ask me
So that would be your training data
you can simulate some more for testing
but anyway
I’m gonna test it out
Back to your architecture
I fear that an LSTM may not be the most appropriate
Try a bi-lstm
Sadly it’s required for this
Well
BiLSTM is generally more efficient?
Technically still lstm so
There are plenty of other architectures that work on sequential data, but if you're bent on using LSTMs
you may as well use bidirectional lstms
the idea is that you read your input sequence both from left to right and from right to left
Also, more regarding your architecture
After your LSTM, if one dense layer does not suffice, add an entire MLP
i.e. consecutive dense layers with activations
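Something along these lines in Keras, as a rough sketch only. The layer sizes, the window length of 10, and the `build_model` helper are assumptions I picked for illustration, not tuned for your problem:

```python
import tensorflow as tf

def build_model(window_len, n_features, lstm_units=32, hidden_units=(64, 32)):
    """Bidirectional LSTM encoder followed by a small MLP head."""
    inputs = tf.keras.Input(shape=(window_len, n_features))
    # Reads the window left-to-right and right-to-left, then concatenates
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_units))(inputs)
    # MLP: consecutive dense layers with activations
    for units in hidden_units:
        x = tf.keras.layers.Dense(units, activation="relu")(x)
    # Linear output: one predicted state, e.g. [i, j] at the next step
    outputs = tf.keras.layers.Dense(n_features)(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_model(window_len=10, n_features=2)
model.summary()
```

The last layer has no activation so the output isn't squashed into a fixed range.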
@hallow garden give me like 10 minutes I’m gonna run some tests and get back
Sure
Yeah
If you're using regularization in your network, it'd be a good idea to rescale your data, yes
I am
Even if physically it may not make sense
Then yes rescale your data
And apply the inverse scaling manually as postprocessing to examine your predictions
It makes a lot of sense because if for example your last dense layer is regularized, it can only reach so far
in terms of prediction norm
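A minimal sketch of the scaling and the manual inverse, assuming plain per-column standardization. The helper names and the random stand-in data are made up, and other scalers would work too:

```python
import numpy as np

def fit_scaler(train):
    """Per-column mean and std, computed on the training data only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # avoid dividing by zero
    return mean, std

def scale(a, mean, std):
    return (a - mean) / std

def unscale(a, mean, std):
    """Inverse transform, applied to predictions as postprocessing."""
    return a * std + mean

# Stand-in for the (1000, 2) solution array
rng = np.random.default_rng(0)
s = rng.normal(loc=5.0, scale=3.0, size=(1000, 2))

mean, std = fit_scaler(s)
s_scaled = scale(s, mean, std)
s_back = unscale(s_scaled, mean, std)
```

You'd train on the scaled values and call `unscale` on the model output before plotting or comparing against the physical displacement.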
Fixed it, but I got a different error now with the graphing
@hallow garden yea the shape is not the issue
Mb
Now it’s just the graphing I need to fix to visualize it
I will try to replicate the experiment on my side, out of curiosity
Can you give me the code for making the dataset?
@fossil root
Yes
@hallow garden got it?