#Should I worry if I sometimes get this error using the Workflow component?
@magic saffron ^
This would indicate that there were multiple attempts to call the same onComplete (for the same invocation). This is a check to ensure only one ever goes through. The only explanation I can think of is a job that failed just as the "recovery" cron was checking in on old action attempts. They'd both try to report the error at the same time.
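To make the "only one ever goes through" check concrete, here is a minimal in-memory sketch of a once-only completion guard. It is not the Workflow component's actual code: `CompletionGuard`, `complete`, `resultFor`, and the `RunResult` shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical model of a "complete at most once" check; not the component's real code.
type RunResult =
  | { kind: "success"; returnValue: unknown }
  | { kind: "failed"; error: string }
  | { kind: "canceled" };

class CompletionGuard {
  private completed = new Map<string, RunResult>();

  // Records the first result for an invocation and throws on any later attempt.
  complete(invocationId: string, result: RunResult): void {
    if (this.completed.has(invocationId)) {
      throw new Error(`onComplete already called for ${invocationId}`);
    }
    this.completed.set(invocationId, result);
  }

  resultFor(invocationId: string): RunResult | undefined {
    return this.completed.get(invocationId);
  }
}

const guard = new CompletionGuard();
// First report, e.g. from the job that failed.
guard.complete("step-1", { kind: "failed", error: "boom" });

let secondAttemptError: string | null = null;
try {
  // Second report for the same invocation, e.g. from a recovery job moments later.
  guard.complete("step-1", { kind: "failed", error: "boom" });
} catch (e) {
  secondAttemptError = (e as Error).message;
}
```

In this model the second reporter is the one that sees the error, while the first result stays recorded unchanged.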
Does it print the id? If so it'd be interesting to investigate the data to see what the result was, if that data is still around.
Got the id, but the data was gone. I couldn't find it with a custom query on the workflow component in the dashboard.
I'll try to capture what the data ends up as the next time it happens.
Got one, but couldn't find the data with db.get.
The logs that might be important are:
~20:31:29 run A of workflow starts
~20:31:33 run B of workflow starts after canceling* A. A was waiting 9s with runAfter.
~20:31:39 the action scheduled on runAfter runs (from A!), and the error appears. Run A doesn't continue.
~20:31:43 the action scheduled on runAfter runs from B, and there isn't an error afterwards
So the action is called even though the workflow was canceled. In the 2nd screenshot, I filtered out most of my logs but kept the first 2, which come just after runAfter, and then the action that ends up with the error on onComplete of run A. Just above that, where it couldn't fit in the screenshot, is a similar log coming from run B.
Small repro https://github.com/benjavicente/convex-workflows-oncomplete-error/blob/main/convex/example.ts
Thanks! Yeah during cancellation any action that's been started will run - but when canceling it before it starts (before the delay elapses) I'd expect to prevent it from running. Maybe the cancel isn't propagating immediately into the workpool and is only marking it as canceled in the workflow layer
So it is behaving somewhat as expected here - the onComplete of A is called with the cancellation, then the second is prevented from running. But ideally the second would never have occurred since the cancellation was before it executed.
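The hypothesis above (cancellation recorded in the workflow layer but not yet propagated to the workpool) can be illustrated with a small simulation. This is only a sketch under assumptions: `Workpool`, `schedule`, `cancel`, `advanceTo`, and `simulate` are hypothetical names, and the times are rough offsets in seconds taken from the log timeline, not real component behavior.

```typescript
// Hypothetical two-layer model: a workflow layer that tracks cancellation
// and a workpool layer that holds delayed jobs.
type Job = { id: string; runAt: number; canceled: boolean };

class Workpool {
  jobs: Job[] = [];
  ran: string[] = [];

  schedule(id: string, runAt: number): void {
    this.jobs.push({ id, runAt, canceled: false });
  }

  cancel(id: string): void {
    for (const job of this.jobs) {
      if (job.id === id) job.canceled = true;
    }
  }

  // Runs every due job that was not canceled in this layer; drops due jobs either way.
  advanceTo(now: number): void {
    const remaining: Job[] = [];
    for (const job of this.jobs) {
      if (job.runAt > now) remaining.push(job);
      else if (!job.canceled) this.ran.push(job.id);
    }
    this.jobs = remaining;
  }
}

function simulate(propagateCancel: boolean): string[] {
  const pool = new Workpool();
  const canceledWorkflows = new Set<string>();

  pool.schedule("A", 9); // run A schedules its action about 9s out
  canceledWorkflows.add("A"); // around 4s in, A is marked canceled in the workflow layer
  if (propagateCancel) pool.cancel("A"); // only then is the delayed job itself canceled
  pool.schedule("B", 13); // run B starts around 4s in with the same delay

  pool.advanceTo(15);
  return pool.ran;
}

const withoutPropagation = simulate(false);
const withPropagation = simulate(true);
```

Without propagation, A's delayed action still fires before B's, which matches the order seen in the logs. With propagation, only B's action runs.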
I'll dig in more
Hi all, I'm getting this error too - during the cancellation of a workflow run. Is there a way to prevent or fix this? Or should I not be concerned about this error?
Thanks in advance.
No need for concern, but I just filed https://github.com/get-convex/workflow/issues/43 to track it, and have a PR up that should address it: https://github.com/get-convex/workflow/pull/44
We're too aggressively checking generation numbers when canceling workflows. It currently throws when a step finishes (or was canceled) after a workflow is canceled. It should gracefully bail i...
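As a rough illustration of the difference between throwing and bailing gracefully on a late step completion, here is a sketch. It is not the code from the linked PR: `WorkflowState`, `generationNumber`, `onStepCompleteStrict`, and `onStepCompleteLenient` are hypothetical names, and the real check may look quite different.

```typescript
// Hypothetical model of a generation check; names and shape are assumptions.
type WorkflowState = { generationNumber: number; canceled: boolean };

// Strict variant: a step finishing for a canceled or stale workflow is an error.
function onStepCompleteStrict(wf: WorkflowState, stepGeneration: number): void {
  if (wf.canceled || wf.generationNumber !== stepGeneration) {
    throw new Error("step completed for a stale or canceled workflow");
  }
  wf.generationNumber += 1;
}

// Lenient variant: the late completion is ignored and state is left untouched.
function onStepCompleteLenient(wf: WorkflowState, stepGeneration: number): boolean {
  if (wf.canceled || wf.generationNumber !== stepGeneration) {
    return false;
  }
  wf.generationNumber += 1;
  return true;
}

const canceledRun: WorkflowState = { generationNumber: 3, canceled: true };

let strictThrew = false;
try {
  onStepCompleteStrict(canceledRun, 3);
} catch {
  strictThrew = true;
}

const lenientApplied = onStepCompleteLenient(canceledRun, 3);
```

In both variants the canceled run's state is unchanged. The only difference is whether the late step surfaces as an error in the logs.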
Ian, you're amazing, thanks for your help! I assume the design pattern here is to run any cancellation / cleanup code in the mutation or action that cancels the workflow, instead of in the onComplete? Not sure if it's due to this error or not, but I noticed onComplete didn't run, or at least didn't run any of the code in my onComplete that's wrapped in an if statement for result.kind === "canceled"
The onComplete should get called with kind === "canceled" if you cancel it - if that's not the case it's a bug. It's working for me, so worth double-checking
Ok great, perhaps an issue on my end in regards to that so I'll dig deeper. Thanks Ian!
@magic saffron I also appear to not be seeing the onComplete run when canceling a workflow. The same onComplete is running correctly when the workflow succeeds.
I've tried logging to the console within the onComplete, but that doesn't seem to be reliable within the Workflow component. I've also tried inserting a row into a table (as a test) and no new row was created
if (args.result.kind === "canceled") {
  await ctx.db.insert("users", {
    name: args.result.kind,
    personalEmail: "[email protected]",
    status: "not-ready",
  });
}
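Since console logs were not a dependable signal here, one way to double-check the branching is to pull the decision out of the handler into a pure function that can be tested without the component. The sketch below assumes a result shape: only `kind` and the "canceled" value appear in this thread, so the "success" and "failed" variants with their `returnValue` and `error` fields are assumptions, and `planCleanup` and `CleanupPlan` are hypothetical names.

```typescript
// Assumed result shape; only `kind === "canceled"` is confirmed by the thread.
type RunResult =
  | { kind: "success"; returnValue: unknown }
  | { kind: "failed"; error: string }
  | { kind: "canceled" };

// Hypothetical description of what the handler should do for a given result.
type CleanupPlan = { insertMarkerRow: boolean; note: string };

// Pure function: decides what to do without touching the database.
function planCleanup(result: RunResult): CleanupPlan {
  switch (result.kind) {
    case "canceled":
      return { insertMarkerRow: true, note: "workflow was canceled" };
    case "failed":
      return { insertMarkerRow: false, note: `workflow failed: ${result.error}` };
    case "success":
      return { insertMarkerRow: false, note: "workflow succeeded" };
  }
}

const canceledPlan = planCleanup({ kind: "canceled" });
const failedPlan = planCleanup({ kind: "failed", error: "timeout" });
const successPlan = planCleanup({ kind: "success", returnValue: null });
```

The onComplete handler would then call `planCleanup(args.result)` and perform the insert only when `insertMarkerRow` is true. If the row still never appears, the handler is probably not being invoked at all, rather than taking the wrong branch.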
Have you upgraded to v0.2.5 (the latest version as of now)?
Ah, no. I was on v0.2.4