#Next wave of Critter Stack Fixes


vestal flint
#

Marten 8 & Wolverine 4 are having maybe a little more teething pains than a normal big release like this. There's another wave of fixes coming by the end of my Monday for:

  • Weasel command line tool usages. db-dump etc.
  • Alba project startup problems
  • Wolverine HTTP code generation issues now that 4.* has a race condition that wasn't a problem before
  • A couple other Wolverine issues
  • Whatever gets opened up in the next 18 hours or so:-?
vestal flint
#

And some other CLI things for Wolverine. Nothing that’s that big, but a couple things that cross projects

vestal flint
#

This is taking a bit. I'm working over the CLI diagnostics and Weasel command line tools (db-apply et al) for some of the changes that happened in Marten 8 / Wolverine 4. A client stumbled into some of this

#

The dotnet run -- describe is a lot nicer looking now, and does include both Marten configuration and Wolverine.HTTP configuration I think for the first time

vestal flint
#

@slim oriole And I think the Alba "fix" from JasperFx is in that too. Some Wolverine fixes, and a lot of improvements to the CLI support for this last round. Blog post from me on that soon-ish

#

Thanks to quite a few contributors, especially with Wolverine!

slim oriole
idle solstice
#

Anything in the latest marten release that could break the health checks?

After upgrading I get:

Daemon projection high water detection was not able to determine a safe harbor sequence

But all sequences look fine in the database. Could this be due to something weird with the schema stuff in the query or something like that? Not really sure where to look yet.

vestal flint
#

I don’t know, I haven’t touched those at all

idle solstice
#

yeah, I am trying to debug it. Maybe it is in the partitioning or schema name or something like that

vestal flint
#

The high water mark isn’t just the sequence though

idle solstice
#

I sometimes hit the return that gives me a number, sometimes return null in GapDetector. Is that what's causing the issue maybe?

#

the bottom one returns 7348. the top one nothing

#

something to do with that lead() query

vestal flint
#

I don't know, I have actually never really looked that hard into the health checks. Those all came from a PR from outside. What do they actually do?

#

I would have said to just query the progression statistics and never to be directly mucking w/ those queries

#

I'm not willing to say that there's any issue in the HighWaterDetector, but again, I'd change the health checks so that it doesn't touch that directly anyway

#

Something that's worth revisiting soon I guess. I meant to work on that for Marten 8, but there wasn't enough time

idle solstice
#

I think the healthchecks are implicit if you turn on wolverine distribution

#

that's why I am hitting this

vestal flint
#

That's not true. There's no connectivity between those two things

idle solstice
#

okay, maybe it is only spamming the console if I turn that off for the health checks for now

#
// .AddMartenAsyncDaemonHealthCheck(maxEventLag: 500);

If I turn that off, I still get an unending repetition of this message

#

I can filter that out ofc, but well

vestal flint
#

Yeah, that's the high water agent that has to run

idle solstice
#

yeah, I get that, but I think this should not be here right?

Daemon projection high water detection was not able to determine a safe harbor sequence, will try again soon

vestal flint
#

Why is your system running unhealthy? That would only be happening if you're having either very slow transactions or if you're using "Rich" appending and having transaction failures

idle solstice
#

It looks fine to me 🤷

vestal flint
#

I have no idea what's going on for you man. Is the sequence off maybe? That's not normal functioning at all

#

Is the Sequence higher than the high water mark though?

idle solstice
#

AppendMode = EventAppendMode.Quick,

vestal flint
#

I've got nothing. We did have a case w/ a client last week where the Postgresql sequence went haywire during a maintenance window

idle solstice
#

It does look like it is on the good number right?

vestal flint
#

Check the sequence

idle solstice
#

I am looking at the wrong table?

vestal flint
#

I'm theorizing that the sequence might be higher than the high water mark, but no activity is coming in

#

The sequence is fed from a PostgreSQL sequence. I'm trying to get you to check the current value of that

idle solstice
#

hmm, I think I do not know how and where to get that. So that's why you keep repeating the same message to me.

#

Because you went out of my context vocabulary

vestal flint
#

Just a second. I think there's an exposed API in Marten to get at it. Let me pull open the code

#

IDocumentStore.Advanced.FetchEventStoreStatistics() -- it's an expensive operation though

#

You want to check the value of mt_events_sequence in whatever your schema is
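
A minimal sketch of reading those numbers through Marten's own API, assuming a hypothetical connection string and a store built with default options (in a real app, use the `IDocumentStore` you already have configured so the schema name matches). It prints the event sequence value next to the recorded progress of each shard, including the high water mark, so the two can be compared:

```csharp
using System;
using Marten;

// Hypothetical connection string; substitute your own.
const string connectionString =
    "Host=localhost;Database=app;Username=postgres;Password=postgres";

// Assumption: default store options. If your events live in a custom schema,
// build the store the same way your application does.
using var store = DocumentStore.For(connectionString);

// Expensive: this also counts events and streams, so keep it out of hot paths.
var stats = await store.Advanced.FetchEventStoreStatistics();
Console.WriteLine($"Event sequence number: {stats.EventSequenceNumber}");
Console.WriteLine($"Event count:           {stats.EventCount}");

// Latest recorded progress for each known shard, including the high water mark.
var progress = await store.Advanced.AllProjectionProgress();
foreach (var shard in progress)
{
    Console.WriteLine($"{shard.ShardName} is at {shard.Sequence}");
}
```

If the event sequence number comes back far below the high water mark, the sequence itself is the thing to look at.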

idle solstice
#

I think it is 1?

#

okay, you found the problem 😄

vestal flint
#

Wait, what? The sequence value is 1?

idle solstice
#

maybe because of the partitioning feature?

vestal flint
#

that's not impacted by partitioning

idle solstice
#

okay, weird. So I suppose something reset it right?

vestal flint
#

Yes. I'd expect you to be having all kinds of issues if that was off. Try resetting the sequence to the highest sequence id in your mt_events table, which is hopefully the value of the high water mark in the progression table?
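
A rough sketch of that reset using plain Npgsql, assuming a hypothetical connection string, the `public` schema, and that the sequence still exists. Treat it as a one-off repair script rather than anything Marten ships, and run it only while nothing is appending events:

```csharp
using System;
using Npgsql;

// Hypothetical connection string and schema; substitute your own.
const string connectionString =
    "Host=localhost;Database=app;Username=postgres;Password=postgres";
const string schema = "public";

await using var conn = new NpgsqlConnection(connectionString);
await conn.OpenAsync();

// Highest event sequence id actually stored in the events table.
await using var maxCmd =
    new NpgsqlCommand($"select max(seq_id) from {schema}.mt_events", conn);
var result = await maxCmd.ExecuteScalarAsync();
if (result is null or DBNull)
{
    Console.WriteLine("mt_events is empty; nothing to reset.");
    return;
}

var maxSeqId = (long)result;

// Move the sequence to that value so the next nextval() hands out maxSeqId + 1.
await using var resetCmd = new NpgsqlCommand(
    $"select setval('{schema}.mt_events_sequence', @value)", conn);
resetCmd.Parameters.AddWithValue("value", maxSeqId);
var newValue = (long)(await resetCmd.ExecuteScalarAsync())!;

Console.WriteLine($"mt_events_sequence is now at {newValue}");
```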

idle solstice
#

yes that would

vestal flint
#

I would have expected you to get tons of postgres errors if you were trying to append events w/ the sequence off like that

idle solstice
#

Maybe me or a colleague did something wrong when upgrading to latest marten...

#

But I do not have any clue on how this happened

vestal flint
#

I can't imagine how the migrations would have reset the sequence

idle solstice
#

yeah same

#

I will report back if it happens again in another env

vestal flint
#

Just look at the SQL that's logged for the migration and see if it's modifying that sequence for some reason

idle solstice
#

yeah, since this is the development environment. It could as well be triggered by someone from their machine by accident. But I will keep an eye out for that

idle solstice
#

The daemon errors went away. My pod is still restarting all the time. Now I get:

Timed out waiting for expected acknowledgement for original message 08dda832-3ce3-71c5-561c-f79d6ccd0000 of type Wolverine.Runtime.Agents.StartAgent
Would you think it is related?

vestal flint
#

Not really, no

#

That just smells like your system being really slow on cold starts

#

And it should "heal" from that

idle solstice
#

So, that last problem was an out of memory of our test pod. Boosting mem a bit solved the problem. Might still be some memory leak in some endpoint, will report back if it is anything Critter related. I noticed they use minimal api and documentsession a lot for endpoints. But I think that should work just fine AFAIK.

app.MapPost("/api/v1/zorgprogrammas", async ([FromBody] ZorgprogrammaOverzichtSearchModel searchModel, IDocumentSession session, CancellationToken cancellationToken) => //.. code
idle solstice
#

hmm, I ran a db-patch and I see this is on the bottom of the sql 👀 . Maybe there is something going on with that

#

This is on another project by the way

vestal flint
#

Definitely a problem if that's really happening. Note that I can't think of any possible reason why that would have changed in V8. Any issue w/ permissions in your system? Does that SEQUENCE already exist? Is that the right sequence for the event storage?

idle solstice
#

It is indeed not available somehow

vestal flint
#

That's the wolverine storage though, that might be in a different schema

idle solstice
#

hmm, weird. Something did remove that mt_events_sequence on this database, but also the one I had issues with yesterday

#

If I find what causes it, I will report it. For now I just shared this for if someone else has problems and you will have a pointer

vestal flint
#

I'm not getting any other reports of this so far

wooden apex
#

So I finally got around to testing the latest updates and they seem a lot better, but I still have issues with the code generation combined with the options.WarmUpRoutes = RouteWarmup.Eager; setting. The code generation no longer fails, but all my HTTP endpoints get generated empty and look like this:

// <auto-generated/>
#pragma warning disable

namespace Internal.Generated.WolverineHandlers
{
}
#

I do see a difference in the output: when it's all empty, it seems to print this for every line. I'm running on macOS or Linux, so I know Dia2Lib isn't available (it's Windows only), but I'm not sure why it's being referenced during the codegen command.

System.IO.FileNotFoundException: Could not load file or assembly 'Dia2Lib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'. The system cannot find the file specified.

File name: 'Dia2Lib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'
at System.Reflection.RuntimeAssembly.InternalLoad(AssemblyName assemblyName, StackCrawlMark& stackMark, AssemblyLoadContext assemblyLoadContext, RuntimeAssembly requestingAssembly, Boolean throwOnFileNotFound)
at System.Reflection.Assembly.Load(AssemblyName assemblyRef)
at JasperFx.RuntimeCompiler.AssemblyGenerator.ReferenceAssembly(Assembly assembly) in /_/src/JasperFx.RuntimeCompiler/AssemblyGenerator.cs:line 67
Could not make an assembly reference to Microsoft.Diagnostics.Tracing.TraceEvent, Version=3.1.15.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a

vestal flint
#

@wooden apex It's maybe likely that you've got some kind of unusual setup that's spoofing the application assembly finding. Try explicitly configuring the ApplicationAssembly.

#

And the codegen has to do a recursive search of every referenced assembly in order for anything to work. Something in your system has to have a transitive reference to whatever that is

wooden apex
#

Tried the ApplicationAssembly, but it doesn't make a difference. You pointed me in the correct direction at least, so thanks! Some details that might be useful to others: Dia2Lib is Microsoft's Debug Interface Access SDK. It's something I think Sentry is pulling in for profiling, and it seems to be some Microsoft mess.

Applied this as temporary solution for now

if (app.Environment.IsDevelopment() &&
    !args.Contains("codegen"))
{
    options.WarmUpRoutes = RouteWarmup.Eager;
}
woeful harbor
#

I'm running into something similar to @wooden apex above during my upgrade. I had to set the ApplicationAssembly in both the CritterStackDefaults options and in the UseWolverine options. If I didn't do the latter, UseFluentValidation would only pick up Wolverine.FluentValidation as part of the assembly scanning and find no validators.

Without adding the ApplicationAssembly in CritterStackDefaults none of my handlers got discovered in tests. It seemed to assume the entrypoint was the test.dll and not discover the handlers. Not sure if that context helps
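
A minimal sketch of that workaround, assuming a top-level-statement Program.cs so that `typeof(Program)` resolves to the application's own assembly. The `JasperFx` namespace for `CritterStackDefaults` is an assumption here; adjust the usings to whatever your package versions expose:

```csharp
using JasperFx;    // assumption: namespace that exposes CritterStackDefaults
using Microsoft.AspNetCore.Builder;
using Wolverine;

var builder = WebApplication.CreateBuilder(args);

// Tell the shared JasperFx options which assembly is "the application",
// instead of relying on entry-assembly detection (which can resolve to the
// test assembly when running under a test harness).
builder.Services.CritterStackDefaults(x =>
{
    x.ApplicationAssembly = typeof(Program).Assembly;
});

// Set it on Wolverine as well, so extensions that scan assemblies during
// UseWolverine see the right one.
builder.Host.UseWolverine(opts =>
{
    opts.ApplicationAssembly = typeof(Program).Assembly;
});

var app = builder.Build();
app.Run();
```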

vestal flint
#

It does and it doesn't. You can't depend on Assembly.GetEntryAssembly(). What version of Wolverine & JasperFx?

#

And what test harness / IDE do you use?

woeful harbor
#

Wolverine 4.30, JasperFx 1.2.0.

using TUnit as my test framework, Alba to wrap web application factory and VS

vestal flint
#

The FluentValidation thing is a timing issue

#

And w/o correcting the AppAssembly, what does it think it is?

woeful harbor
#

So in UseWolverine application assembly is null and Assemblies is a collection containing a single assembly with: {Wolverine.Http.FluentValidation, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null}

vestal flint
#

Yeah, because right now the fluent validation finding is executed before the JasperFxOptions thing is specified. Timing issue on top of everything else

woeful harbor
#

Yeah that makes sense with the order the breakpoints are hit too