#With XState v5, it seems that restored actors don't restore timers automatically


tacit anvil
#

By discussing with David (<#1105673412216619148 message>), I tried to use XState v5 to persist and restore a machine that creates timers (after transitions).

It seems that timers are not recreated when a machine is restored. I would expect timers to be recreated when the machine is restored.

My reproduction: https://stackblitz.com/edit/typescript-ujewoz?file=index.ts.

Did I do something wrong? Thanks for your help.


hallow gulch
#

Timers aren't restored yet. We're still trying to determine the proper behavior here, maybe you can help

#

There are different things that can happen. Let's say that you have a 10 second timer, you stopped the machine 2 seconds in, and you restored it 2 seconds later. What should happen?

#
  1. Don't run the timer again regardless (current behavior, probably not expected)
  2. Run the timer for 8 seconds, since it was stopped at 2 seconds
  3. Run the timer for 6 seconds, since it was restored at 4 seconds from when it was started
  4. Rerun the timer from the beginning for 10 seconds
#

Ideally this should somehow be configurable

#

Maybe something like...

after: {
  2000: {
    target: 'goHere',
    restore: true, // default restart policy: start from beginning
    // restore: 'restart' (4) (default)
    // restore: 'timestamp' (3)
    // restore: 'elapsed' (2)
    // restore: false (1)
  }
}

What do you think?
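
To make the four options concrete, here is a minimal TypeScript sketch of the delay each proposed `restore` value would schedule on restoration. The `restore` option is only a proposal in this thread, and `RestorePolicy`, `TimerRecord`, and `remainingDelay` are hypothetical names for illustration, not XState APIs.

```typescript
// Hypothetical sketch: none of these names exist in XState.
type RestorePolicy = "restart" | "timestamp" | "elapsed" | false;

interface TimerRecord {
  delayMs: number;   // configured delay of the `after` transition
  startedAt: number; // wall-clock ms when the timer was started
  stoppedAt: number; // wall-clock ms when the machine was stopped
}

// Returns the delay to schedule on restore, or null if no timer should run.
function remainingDelay(
  policy: RestorePolicy,
  timer: TimerRecord,
  restoredAt: number
): number | null {
  if (policy === false) {
    return null; // (1) do not run the timer again
  }
  if (policy === "elapsed") {
    // (2) only the time the machine was actually running counts
    return Math.max(0, timer.delayMs - (timer.stoppedAt - timer.startedAt));
  }
  if (policy === "timestamp") {
    // (3) all wall-clock time since the original start counts
    return Math.max(0, timer.delayMs - (restoredAt - timer.startedAt));
  }
  return timer.delayMs; // (4) "restart": run the full delay again
}

// The scenario from the thread: 10 s timer, stopped at 2 s, restored at 4 s.
const timer: TimerRecord = { delayMs: 10000, startedAt: 0, stoppedAt: 2000 };
const restoredAt = 4000;

console.log(remainingDelay(false, timer, restoredAt));       // null
console.log(remainingDelay("elapsed", timer, restoredAt));   // 8000
console.log(remainingDelay("timestamp", timer, restoredAt)); // 6000
console.log(remainingDelay("restart", timer, restoredAt));   // 10000
```

Options 2 and 3 both need the timer's start time to be recorded somewhere, and option 2 additionally needs the time at which the machine was stopped.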

#

Replay behaviors can also extend to other actors in general:

invoke: {
  src: 'someSource',
  restore: true, // restart the invocation on restoration
  // restore: false, // do not restart (treat as fire-and-forget)
}
summer mist
# hallow gulch 1. Don't run the timer again regardless (current behavior, probably not expected...

Temporal already has some solid policies for dealing with this kind of situation; maybe we can use them as inspiration (so it's compatible). Specifically, backfill and catchup window. See below from https://docs.temporal.io/workflows

"Catchup Window

The Temporal Cluster might be down or unavailable at the time when a Schedule should take an Action. When it comes back up, the Catchup Window controls which missed Actions should be taken at that point. The default is one minute, which means that the Schedule attempts to take any Actions that wouldn't be more than one minute late. An outage that lasts longer than the Catchup Window could lead to missed Actions. (But you can always Backfill.)"

#

--

#

Consider the case that you have multiple subsequent "after" actions, each lasting 1 second. I'd expect that setting the policy to catchup would automatically skip afters until the right time was reached.

But I think you are on the right track, having multiple policies to configure what happens after a machine is restored. Yeah I like the policies start from the beginning, catchup to the current timestamp, from paused timestamp.
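
The chained-`after` case can be sketched as follows: given the delays of consecutive delayed transitions and the wall-clock time elapsed since the first timer started, skip every step that would already have fired and return the remaining delay of the current one. `catchUp` and `CatchUpResult` are hypothetical names for illustration, not part of XState or Temporal.

```typescript
// Hypothetical sketch of a "catch up to the current timestamp" policy.
interface CatchUpResult {
  stepIndex: number;   // index of the step still waiting; delaysMs.length if all fired
  remainingMs: number; // time left on that step's timer
}

function catchUp(delaysMs: number[], elapsedMs: number): CatchUpResult {
  let remaining = elapsedMs;
  for (let i = 0; i < delaysMs.length; i++) {
    const delay = delaysMs[i];
    if (remaining < delay) {
      // This step's timer has not expired yet: resume here.
      return { stepIndex: i, remainingMs: delay - remaining };
    }
    // This step would already have fired while the machine was down: skip it.
    remaining -= delay;
  }
  return { stepIndex: delaysMs.length, remainingMs: 0 };
}

// Five consecutive 1-second `after` steps, restored 3.5 seconds after the first started.
const caughtUp = catchUp([1000, 1000, 1000, 1000, 1000], 3500);
console.log(caughtUp); // { stepIndex: 3, remainingMs: 500 }
```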

summer mist
#

--

#

@tacit anvil Would a possible workaround be to avoid using native xstate timers altogether and just create a state which invokes a service that wraps wf.sleep? Therefore, Xstate will hydrate in the right state and temporal will control the catchup policy.

summer mist
#

@tacit anvil or does that fall apart if you have parallel machines

tacit anvil
#

Thank you @hallow gulch for your answer.

I think having a way to specify how individual timers and actors are restored is great. Though, I think that, for timers at least, we should also be able to change the default behavior for a whole machine.

Options 2 and 3 would require more control over timers than is currently available, I think, such as knowing when they were started. Would you expose this data publicly, so that it is possible to know from the outside when a timer was started?

tacit anvil
# summer mist <@278658471405289472> Would a possible workaround be to avoid using native xstat...

@summer mist Yes, it might be a solution for now! But what I found interesting with the integration of XState in Temporal workflows was that there was actually nothing to do for both to work together.
At first I thought I had to reimplement the clock used by XState (setTimeout and clearTimeout) to use the sleep function from Temporal, but it turned out to be unnecessary, as Temporal correctly implements setTimeout and clearTimeout.

tacit anvil
summer mist
#

Are there any other areas you noticed where they don't play well?

tacit anvil
# summer mist I'll have to play around with it. So youre saying the problem is basically that ...

With XState v4, there is an issue with calling continueAsNew, because all actions are persisted in the state, including actions cancelling timers, and this one does not play well with Temporal, as when we restore the machine it tries to cancel a timer that doesn't exist. But with a workaround it works well.

With XState v5, as David said, timers are not persisted nor restored for now, so Temporal workflows can be stuck, waiting for a timer that was never started to complete.

Apart from this concern that will be addressed, XState and Temporal play absolutely well together from my experience!

summer mist
dusty lake
#

@summer mist @hallow gulch I'm wondering if there was a concrete outcome from this discussion.

We're working on a browser-like history stack (back, forward, and pop) as higher-level actor logic. This includes an optional storage interface.

On pop, we would like invoked child actors to be re-executed after restoring the parent's snapshot.

I saw a thread on Github where @hallow gulch mentioned something like

What if an invoked actor runs a post request, you might not want to do that again.

In our case, we are in fact doing that, and we would like it to happen again 🙂

Is there currently a way to re-invoke an actor after restoring?

function withHistory(machine) {
  return {
    ...machine,
    transition() {
      // event.type === history.pop
      // restoreSnapshot
      // reinvoke child actor with input
      // input should be the history.pop event (with callback data)
      return nextState
    }
  }
}

Any thoughts on this?

hallow gulch
#

If you persist/replay events instead of state, you can achieve this
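
As a rough illustration of that idea, the sketch below persists the event log rather than the snapshot and rebuilds the state by replaying it through a pure transition function. Because the final state is entered again during the replay, its entry effect (here standing in for a child invocation) runs again. It deliberately uses a hand-written reducer instead of XState so that it stays self-contained; `PageEvent`, `HistoryState`, `transition`, and `replay` are assumptions made for this example.

```typescript
// Hypothetical event-sourcing sketch; not XState's API.
type PageEvent = { type: "push"; url: string } | { type: "pop" };

interface HistoryState {
  stack: string[];
}

// Pure transition: the same state and event always give the same next state.
function transition(state: HistoryState, event: PageEvent): HistoryState {
  if (event.type === "push") {
    return { stack: [...state.stack, event.url] };
  }
  return { stack: state.stack.slice(0, -1) };
}

// Rebuild the state from the persisted log, then run the entry effect for the
// page that ends up on top (this is where a child actor would be re-invoked).
function replay(log: PageEvent[], onEnter: (url: string) => void): HistoryState {
  const initial: HistoryState = { stack: [] };
  const state = log.reduce(transition, initial);
  const current = state.stack[state.stack.length - 1];
  if (current !== undefined) {
    onEnter(current);
  }
  return state;
}

const persistedLog: PageEvent[] = [
  { type: "push", url: "/a" },
  { type: "push", url: "/b" },
  { type: "pop" },
];
const invoked: string[] = [];
const restored = replay(persistedLog, (url) => invoked.push(url));

console.log(restored.stack); // [ '/a' ]
console.log(invoked);        // [ '/a' ]
```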

analog fossil
#

Did anyone wind up building a working example of XState v5 and Temporal? I'm also running into this issue with timers not picking up correctly (resulting in an error: Missing associated machine for Timer(0)). Baptiste's v4 repos do not appear to do any kind of explicit persistence, so I'm struggling to see how I would integrate this advice ^ (i.e. where/how do you actually do persistence? it seems to be "automatic"/inherent in the v4 examples).

#

Would a possible workaround be to avoid using native xstate timers altogether and just create a state which invokes a service that wraps wf.sleep? Therefore, Xstate will hydrate in the right state and temporal will control the catchup policy.

I could possibly see a path to this, although not being able to define after transitions / having to do those as a special kind of action feels a little unfortunate. Also, wouldn't the workflow sleep immediately put it to sleep and not let it complete other regular actions? This is definitely stretching thin my current knowledge of Temporal and XState 😄

hallow gulch
analog fossil
#

Internal codebase... Let me see if I can pull out a minimal repro

analog fossil
hallow gulch
#

Thank you!

analog fossil
#

Oh doy, that is from this very thread as the "not working" example haha

#

Added a minimal Temporal worker

analog fossil
#

Any thoughts on this? Really hoping I can get v5 to work with delayed transitions 😄

analog fossil
vital hornet
#

Author here. Yes…there’s an implementation of persistent timers there. They can be resumed or modified on continueAsNew.

I don’t know how to achieve the same with v5 yet. The general approach was to intercept/proxy every delaySend event and store an internal map with the start time, timer name, and state value the timer originated from. When timers fire they’re cleared from the map. When the workflow continues, the map is serialized and sent to the new workflow. They can be modified by the migration function (deleted, or with different delays). The remaining duration is computed as a function of the elapsed time (now minus start) and a desired delay (whatever you returned from the migration function or whatever your new machine says it should be). The code that starts the machine starts the timers using the unproxied delaySend I think.
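
The bookkeeping described above might look roughly like this. It is a reconstruction from the description, not the author's code: `TimerRegistry`, its method names, and the `migrate` callback are hypothetical, and the proxying of XState's delayed send is left out entirely.

```typescript
// Hypothetical reconstruction of the timer map; not the original implementation.
interface TimerEntry {
  name: string;       // timer name
  stateValue: string; // state value the timer originated from
  startedAt: number;  // ms timestamp when the timer was started
  delayMs: number;    // desired delay
}

class TimerRegistry {
  private timers = new Map<string, TimerEntry>();

  // Called when a delayed send is intercepted.
  start(entry: TimerEntry): void {
    this.timers.set(entry.name, entry);
  }

  // Called when a timer fires: it is cleared from the map.
  fired(name: string): void {
    this.timers.delete(name);
  }

  // Serializable form handed to the next workflow run.
  serialize(): TimerEntry[] {
    return Array.from(this.timers.values());
  }

  // Rebuild in the new run; `migrate` may drop a timer (null) or change its delay.
  static restore(
    entries: TimerEntry[],
    migrate: (entry: TimerEntry) => TimerEntry | null = (entry) => entry
  ): TimerRegistry {
    const registry = new TimerRegistry();
    for (const entry of entries) {
      const migrated = migrate(entry);
      if (migrated !== null) {
        registry.start(migrated);
      }
    }
    return registry;
  }

  // Remaining duration = desired delay minus elapsed time (now minus start), never negative.
  remaining(name: string, now: number): number | undefined {
    const entry = this.timers.get(name);
    if (entry === undefined) {
      return undefined;
    }
    return Math.max(0, entry.delayMs - (now - entry.startedAt));
  }
}

const previousRun = new TimerRegistry();
previousRun.start({ name: "reminder", stateValue: "waiting", startedAt: 1000, delayMs: 10000 });
previousRun.start({ name: "obsolete", stateValue: "waiting", startedAt: 1000, delayMs: 5000 });

// On continue: drop one timer and shorten the other.
const nextRun = TimerRegistry.restore(previousRun.serialize(), (entry) =>
  entry.name === "obsolete" ? null : { ...entry, delayMs: 8000 }
);

console.log(nextRun.remaining("reminder", 4000)); // 5000
console.log(nextRun.remaining("obsolete", 4000)); // undefined
```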

analog fossil
#

Sounds like this is going to be sort of tough to get working hmm

vital hornet
#

It is yes but there were significant changes after that beta

calm flame
#

I take it there is no progress here?

#

Sorry for necropost but I was playing around with this and stumbled upon the same issue on all workflows not just those with delays set up

calm flame
#

nevermind, can confirm that waitFor is also affected which is why removing the timer didn't appear to do anything.

calm flame
#

xstate seems to be always passing back undefined when clearing the timer....

#

that will do it!

#

waitFor also tried to clear timeout with undefined id

#

solution for now is as follows

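// Workaround: Temporal does not handle clearTimeout(undefined), so filter those calls out.
const clearTimeoutInner = global.clearTimeout;
global.clearTimeout = (id) => {
    if (id !== undefined) {
        clearTimeoutInner(id);
    } else {
        // No timer id was passed: log it and skip the call.
        console.log("INVALID TIMEOUT");
    }
};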

seems like xstate needs to ensure that these ids are set properly, neither waitFor nor after clears timeouts correctly...

#

looks like ignoring 'invalid' ones is OK

hallow gulch
calm flame
#

sure! from what I understand xstate 4 + temporal works nicely, and I have discovered scattered threads from people struggling to get xstate 5 working. based on some digging I did, the underlying issue is that temporal doesn't handle undefined being passed to clearTimeout, which xstate 5 seems to do frequently for some reason. have not discovered the root cause of that.

Passing undefined to clearTimeout is what triggers the issue described by this message in this thread #1105930226539692152 message so I have as a workaround overridden clearTimeout to just log a warning in this case and the error no longer appears

calm flame
#

while I am here, seems that creating an actor via fromPromise is also not working due to temporal workflows disallowing AbortController

#

monkeypatching that out seems to work, and now the state machine is fully functional 🎉
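
The patch itself is not shown in the thread. One way such a monkeypatch could look is a minimal stand-in class installed in place of `AbortController`. Everything below is an assumption for illustration: `StubAbortController`, `StubAbortSignal`, and `installAbortControllerStub` are made-up names, and the stub only covers `signal.aborted`, `abort`, and `abort` listeners.

```typescript
// Hypothetical stand-in for AbortController; not the code used in the thread.
class StubAbortSignal {
  aborted = false;
  reason: unknown = undefined;
  private listeners: Array<() => void> = [];

  addEventListener(type: string, listener: () => void): void {
    if (type === "abort") {
      this.listeners.push(listener);
    }
  }

  removeEventListener(type: string, listener: () => void): void {
    if (type === "abort") {
      this.listeners = this.listeners.filter((l) => l !== listener);
    }
  }

  // Internal helper: marks the signal aborted and notifies listeners once.
  _fire(reason: unknown): void {
    if (this.aborted) {
      return;
    }
    this.aborted = true;
    this.reason = reason;
    for (const listener of this.listeners) {
      listener();
    }
  }
}

class StubAbortController {
  readonly signal = new StubAbortSignal();

  abort(reason?: unknown): void {
    this.signal._fire(reason);
  }
}

// Unconditionally installs the stub on the given global-like object.
function installAbortControllerStub(target: Record<string, unknown>): void {
  target["AbortController"] = StubAbortController;
}

// Demonstrated on a plain object; a real patch would pass the global object.
const fakeGlobal: Record<string, unknown> = {};
installAbortControllerStub(fakeGlobal);

const Controller = fakeGlobal["AbortController"] as typeof StubAbortController;
const controller = new Controller();
let abortCalls = 0;
controller.signal.addEventListener("abort", () => {
  abortCalls += 1;
});
controller.abort("stopped");
controller.abort("again"); // ignored: already aborted

console.log(controller.signal.aborted, controller.signal.reason, abortCalls); // true stopped 1
```

Note that a stub like this gives up real cancellation: nothing that relies on the platform's abort semantics (such as `fetch`) will react to it.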

hallow gulch
hallow gulch
calm flame
#

I am happy to do both! Thanks for listening to my stream of consciousness 🙂 I will try to get a PR for the first item up and we can perhaps discuss a design for how xstate can work with Temporal's cancellation primitives

#

Will open issues for both before starting on the PR. have a nice day