#Copilot Issue or Claude API Limit

astral ibex

I asked my Copilot to generate a comprehensive report about a merchant. To do that, it executed multiple queries and gathered data successfully. It then began generating a summary based on the collected data, but the response stopped midway without completing the summary.

The /v1 endpoint did not return any errors, so I don’t have visibility into what caused the interruption.

How can I debug this issue, and what should I look for to understand why the Copilot’s response stopped halfway?

mental needle BOT

This solution has been AI generated

Your Copilot likely stopped due to either token/output limits or a streaming interruption. You can debug by verifying the maxTokens forwarded to the model, checking for stream stop signals in the resolver, and monitoring for silent rate-limit retries.

Likely causes (from the codebase)

  • Token limit truncation:
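    • forwardedParameters.maxTokens maps to Anthropic max_tokens (defaults to 1024 if unset). When the model reaches this cap, generation stops mid-response, and the Anthropic API reports it as a stop reason (max_tokens) rather than as an error.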
    • Input messages may be truncated by limitMessagesToTokenCount() before the call, reducing context and contributing to shorter outputs.
  • Streaming interruptions:
    • Streams can be halted by interruptStreaming$, stopStreamingText, or stopStreamingMessages in the GraphQL resolver.
  • Rate limits/retries:
    • 429s are retried with backoff; if limits are hit repeatedly, you may observe partial or halted outputs without a surfaced error.
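
To make the first cause concrete, here is a minimal sketch of the default described above. It is not the adapter's actual code: `ForwardedParameters` is reduced to one field, and `resolveMaxTokens` and `ASSUMED_DEFAULT_MAX_TOKENS` are hypothetical names. The 1024 value is taken from this answer and should be confirmed in anthropic-adapter.ts.

```typescript
// Hypothetical shape: only the field discussed above is modeled.
interface ForwardedParameters {
  maxTokens?: number;
}

// Default named in the answer above; confirm it against anthropic-adapter.ts.
const ASSUMED_DEFAULT_MAX_TOKENS = 1024;

// Returns the output cap that would be sent as Anthropic `max_tokens`,
// plus a flag that tells you whether the fallback was used.
function resolveMaxTokens(forwarded?: ForwardedParameters): {
  maxTokens: number;
  usedDefault: boolean;
} {
  const explicit = forwarded?.maxTokens;
  if (explicit === undefined) {
    return { maxTokens: ASSUMED_DEFAULT_MAX_TOKENS, usedDefault: true };
  }
  return { maxTokens: explicit, usedDefault: false };
}

// Example: a request that never set maxTokens.
const resolved = resolveMaxTokens({});
if (resolved.usedDefault) {
  console.warn(`maxTokens not set; falling back to ${resolved.maxTokens}`);
}
```

If a warning like this would fire for your request, a long report summary is likely to hit the cap before it finishes.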

How to debug it quickly

  • Verify/increase output tokens:
    • Ensure you pass a higher forwardedParameters.maxTokens for long summaries (e.g., increase beyond 1024) so Anthropic doesn’t truncate the response.
  • Add targeted logging:
    • In anthropic-adapter.ts, log the max_tokens sent and streaming lifecycle.
    • In the resolver, log when interruptStreaming$, stopStreamingText, or stopStreamingMessages fire to catch premature stops.
  • Check input truncation:
    • Inspect limitMessagesToTokenCount() behavior to confirm conversation/history isn’t being pruned too aggressively.
  • Monitor network/rate limits:
    • Watch for 429s or stream disconnects; repeated retries can mask the root cause.
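
The logging step above can be sketched as a small summarizer over the raw Anthropic stream events. It assumes you have added your own logging that collects the events the adapter receives; `summarizeStream` and its types are hypothetical helpers, not part of the codebase. The event names (`message_delta`, `message_stop`) and the `max_tokens` stop reason come from Anthropic's Messages streaming format.

```typescript
// Minimal subset of the Anthropic streaming event fields used here.
interface StreamEvent {
  type: string;
  delta?: { stop_reason?: string | null };
  usage?: { output_tokens?: number };
}

type Diagnosis = "completed" | "truncated_by_max_tokens" | "stream_interrupted";

interface StreamSummary {
  stopReason: string | null;
  outputTokens: number | null;
  sawMessageStop: boolean;
  diagnosis: Diagnosis;
}

// Walks the logged events and reports how the stream ended.
function summarizeStream(events: StreamEvent[]): StreamSummary {
  let stopReason: string | null = null;
  let outputTokens: number | null = null;
  let sawMessageStop = false;

  for (const event of events) {
    if (event.type === "message_delta") {
      // message_delta carries the stop reason and the output token count.
      stopReason = event.delta?.stop_reason ?? stopReason;
      outputTokens = event.usage?.output_tokens ?? outputTokens;
    } else if (event.type === "message_stop") {
      sawMessageStop = true;
    }
  }

  let diagnosis: Diagnosis;
  if (stopReason === "max_tokens") {
    diagnosis = "truncated_by_max_tokens"; // output cap was reached
  } else if (!sawMessageStop) {
    diagnosis = "stream_interrupted"; // stream ended before the final event
  } else {
    diagnosis = "completed";
  }

  return { stopReason, outputTokens, sawMessageStop, diagnosis };
}

// Example: a stream that ended because the output cap was reached.
const summary = summarizeStream([
  { type: "message_start" },
  { type: "content_block_delta" },
  { type: "message_delta", delta: { stop_reason: "max_tokens" }, usage: { output_tokens: 1024 } },
  { type: "message_stop" },
]);
console.log(summary.diagnosis, summary.outputTokens);
```

A `stream_interrupted` result points toward the resolver's stop signals or a network disconnect, while `truncated_by_max_tokens` points back at forwardedParameters.maxTokens.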

Source references

  • Anthropic adapter (max_tokens, streaming): packages/runtime/src/service-adapters/anthropic/anthropic-adapter.ts
  • Token limiting utils: packages/runtime/src/service-adapters/anthropic/utils.ts
  • Resolver streaming/interrupts: packages/runtime/src/graphql/resolvers/copilot.resolver.ts
  • Forwarded parameters (includes maxTokens): packages/runtime/src/graphql/inputs/forwarded-parameters.input.ts
  • Retry and rate limits: packages/runtime/src/lib/runtime/retry-utils.ts, packages/runtime/src/lib/error-messages.ts

untold harness

Hi @astral ibex, just checking in. Did that answer your question?