#Hey, I’m running Permify on Kubernetes

1 messages · Page 1 of 1 (latest)

rare remnant
mossy nebula
#

Hi, thank you for the response. I opened that issue, so I’ll wait for your solution.

livid oar
#

Hey @mossy nebula, I saw your reply on github you tried headless service and it’s still not working.
Can you share your service and deployment yamls so i can look deeper.

mossy nebula
livid oar
#

Hello @mossy nebula , your headless Service doesn’t have a namespace set, so it was created in the default namespace. Can you create it in permify and retry?

mossy nebula
#

hi, @livid oar sorry forgot to mention if it worth. I'm deploying it with ArgoCD on permify namespace, so the actual test was in permify namespace

livid oar
#

I get it @mossy nebula . In that case, the configuration looks correct.Can you verify that the headless service is actually pointing to the pod IPs?:
kubectl describe svc permify-headless -n permify
Also please check pod config to make sure Argo applied it correctly.

And can you share the error messages you are seeing?

mossy nebula
#

hey @livid oar, I checked again and the headless-service pointing to correct pods ips, also the configuration applied in the cluster.

attaching the full log, the main errors are:

time=2025-11-27T10:01:58.153Z level=DEBUG msg="A context-related error occurred" error="context canceled"
time=2025-11-27T10:01:58.153Z level=ERROR msg=ERROR_CODE_CANCELLED
time=2025-11-27T10:01:58.153Z level=ERROR msg="finished call" protocol=grpc grpc.component=server grpc.service=base.v1.Permission grpc.method=Check grpc.method_type=unary peer.address=<ip-2>:36676 grpc.start_time=2025-11-27T10:01:58Z grpc.request.deadline=2025-11-27T10:02:02Z grpc.code=Internal grpc.error="rpc error: code = Internal desc = ERROR_CODE_CANCELLED" grpc.time_ms=8.445
time=2025-11-27T10:01:58.189Z level=ERROR msg="rpc error: code = Canceled desc = context canceled"
time=2025-11-27T10:01:58.193Z level=ERROR msg="rpc error: code = Canceled desc = context canceled"
time=2025-11-27T10:01:58.193Z level=DEBUG msg="A context-related error occurred" error="context canceled"

livid oar
#

Thanks for the logs @mossy nebula . I took a look, and the good news is there is no more 4s timeout. They are failing almost instantly with that cancellation error, ERROR_CODE_CANCELLED, which can mean client or a proxy is cutting the connection immediately.

Forwarding is actually working fine for the most part, but it seems to drop specifically during those high-traffic bursts. But I can't reproduce this errors on my side.

Could you double-check your client timeouts? Also, if you can upgrade the Permify to v1.5.2 and and run k6 test from your side it would be very helpful to understand if its client or load issue.

mossy nebula
#

Yes ill check if the timeouts are from the client side or there is some issue on how we implement the schemas.

Those logs are from our staging environment with very low traffic, so i'm not sure it related to traffic bursts.

I'll also try to upgrade.

Thank you.

livid oar
#

Thanks, please keep us updated.