I'm running a very large NestJS app and using the Datadog integration to help find bottlenecks in the code. One endpoint shows requests that took 25 seconds to return, and it happens randomly throughout the day.
The flamegraph shows the code itself executed in under 2 seconds, but the web request then hung for the other 24 seconds.
The app runs as AWS ECS tasks behind a load balancer as well.
Any thoughts on what I can look at? This seems to be the main endpoint where it happens.
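
For reference, this is roughly the kind of per-request timing I could add to separate "handler finished" from "connection actually closed". It is only a sketch: `trackTiming` and `RequestTiming` are made-up names, not part of NestJS or Datadog, and it assumes the response object emits Node's standard `finish` and `close` events.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical record shape; field names are illustrative only.
interface RequestTiming {
  url: string;
  finishMs: number | null; // start -> 'finish' (response handed off to the OS)
  closeMs: number;         // start -> 'close' (connection or response closed)
}

// Works with anything that emits 'finish' and 'close', which Node's
// http.ServerResponse does. `now` is injectable so the logic can be tested.
function trackTiming(
  url: string,
  res: EventEmitter,
  report: (t: RequestTiming) => void,
  now: () => number = () => Date.now(),
): void {
  const start = now();
  let finishedAt: number | null = null;

  // 'finish' fires once the whole response has been written out.
  res.once("finish", () => {
    finishedAt = now();
  });

  // 'close' fires when the response completes or the connection drops early.
  res.once("close", () => {
    report({
      url,
      finishMs: finishedAt === null ? null : finishedAt - start,
      closeMs: now() - start,
    });
  });
}
```

Assuming the default Express adapter, I think this could be called from a plain middleware, e.g. `app.use((req, res, next) => { trackTiming(req.originalUrl, res, console.log); next(); })`. A large gap between `finishMs` and `closeMs`, or a `finishMs` of `null`, would at least tell me whether the time is being lost after the handler returns.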