# Issue with free AI model: output isn't consistent


slender wyvern
#

Hi,

I have an agent in n8n that uses the deepseek:deepseek-chat-v3.1:free AI model and runs in a loop 6 times. Of the 6 items the agent receives, 4 runs complete successfully and predictably, while the other 2 return unexpected output.

I've attached the activity log of the agent runs. The third run output only 31 tokens and the fourth output 4098 tokens. The third run basically returned nothing: just the JSON structure, with none of the input data processed. The fourth run returned (if I'm correct) the words "categorize the following news articles" over and over again.

I also found that most of the time when I try to run the agent, it says I hit the rate limit, even though I haven't passed the daily limit and my total is less than 20 requests/min.

My question: is the free AI model unreliable (because it keeps reporting rate-limit hits) and unpredictable (because it returns anomalous output) when run in a loop?

vivid ridge BOT
#

I'm an AI assistant - my answers may be incorrect and I cannot see screenshots. Please verify information with the OpenRouter docs.

To help you and the support team, I need a bit more detail:

  • What is the exact error message you get for the "hit rate limit" response?
  • What input (prompt/messages/payload) did you send to the agent on the runs with unexpected output?
  • Can you share the n8n workflow or relevant code snippet of how you call the model in your loop?

Without this, I can't diagnose precisely, but here's what's relevant about the free model:

  • "OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime" and "recent uptime" for providers is typically high, but not 100%. Free variants can be more rate-limited or less prioritized, which can cause 429/rate-limit errors even before hitting stated quotas (source: DeepSeek V3.1 (free) - API, Providers, Stats).
  • "DeepSeek-V3.1 (free)" supports long context, but performance and reliability can vary by provider and load. Repetitive or malformed outputs may be due to input formatting, hitting output length limits, or model instability under load (source: Performance Metrics).
  • The OpenRouter API docs recommend monitoring for retryable errors like rate limits with free models and suggest using paid or alternative providers for production workloads.
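
Handling retryable rate-limit errors, as mentioned in the last point, is commonly done with exponential backoff. The following is only a minimal sketch, not OpenRouter's or n8n's own mechanism: `RateLimitError` and `call_with_backoff` are hypothetical names, and the delay values are assumptions to tune.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Run call(); on RateLimitError, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps parallel loop items from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

In a loop, each model request would be wrapped as `call_with_backoff(lambda: send_request(item))`, where `send_request` is whatever function performs the call and raises `RateLimitError` on a 429.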

Please provide the details above so the mod team can better assist if escalation is needed.

-# This is an automated response powered by Answer HQ. Use !mute to stop the bot from responding.

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behav...

drowsy crescent
#

!mute

vivid ridge BOT
#

🔇 Thread muted. The bot will no longer respond here.

drowsy crescent
# slender wyvern Hi, I have an agent that use deepseek:deepseek-chat-v3.1:free ai model in n8n,...

why aren't model outputs consistent

The answer is that model outputs are guided by random number generators. Set the temperature to 0 to get more consistent outputs (it's not guaranteed to be the same tho).

The technical explanation is that the model doesn't choose a single word. It produces a list of candidate words that would all work, each with a probability. So if the input was "the cat sat on the", the output could be "mat" or "floor" or "chair", and they would all make sense. One word from that list is sampled as the predicted word, and the result, "the cat sat on the floor", becomes the model's new input.
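
A toy sketch of that sampling step, to show what temperature changes. The scores are made-up numbers and `sample_next_word` is a hypothetical helper, not how any provider actually implements decoding:

```python
import math
import random


def sample_next_word(scores, temperature, rng):
    """Pick one word from a dict of word -> raw score (logit)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring word.
        return max(scores, key=scores.get)
    # Softmax with temperature: lower values sharpen the distribution.
    scaled = {word: score / temperature for word, score in scores.items()}
    top = max(scaled.values())
    weights = [math.exp(value - top) for value in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Toy scores for the word after "the cat sat on the" (made-up numbers).
scores = {"mat": 2.0, "floor": 1.5, "chair": 1.0}
rng = random.Random()
greedy = {sample_next_word(scores, 0, rng) for _ in range(20)}
sampled = {sample_next_word(scores, 1.0, rng) for _ in range(200)}
print(greedy)   # {'mat'}
print(sampled)  # usually more than one distinct word
```

In this toy version temperature 0 is fully deterministic. A hosted model can still vary between runs, which is why the same output isn't guaranteed.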

#

why aren't the free models reliable

So many people are using them that OpenRouter hits rate limits. Chutes only gives OpenRouter so much capacity.

Use GLM 4.5 Air (free) for fewer rate limit errors.

slender wyvern
#

I understand how the LLM behaves. What I don't understand is why the agent loops over the same structured data but behaves differently. What I mean by "same structured data" is that my agent handles inputs with this structure:
{
  cluster: {
    {
      title: ...,
      description: ...
    }, ...
  },
  cluster: {
    ...
  }
}

It loops over that structure and assigns each title and description to a specific category. The data is the same, so why can the output be so different? Could this problem be related to how Chutes limits capacity, to the model itself, or to some other limitation that makes it behave like that?
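
Whatever the cause, the two bad runs could at least be caught in the loop before their output is used. A rough sketch, assuming the agent is expected to return JSON and that the API response exposes an OpenAI-style `finish_reason` field (not confirmed for this n8n setup); `check_output` and `has_content` are hypothetical helpers:

```python
import json
import re
from collections import Counter


def has_content(value):
    """True if a parsed JSON value contains at least one non-empty leaf."""
    if isinstance(value, dict):
        return any(has_content(v) for v in value.values())
    if isinstance(value, list):
        return any(has_content(v) for v in value)
    return value not in (None, "")


def check_output(text, finish_reason=None, max_repeats=5):
    """Return a short reason if the output looks broken, otherwise None."""
    if finish_reason == "length":
        return "truncated"  # the model hit its output token limit
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON: check whether one sentence or line repeats many times.
        chunks = [c.strip().lower() for c in re.split(r"[.\n]+", text) if c.strip()]
        if chunks and Counter(chunks).most_common(1)[0][1] > max_repeats:
            return "repetition"
        return "invalid_json"
    return None if has_content(parsed) else "empty_structure"
```

A run like the third one (JSON structure with no data) would come back as "empty_structure", and one like the fourth (the same phrase on repeat) as "repetition" or "truncated", so the loop could retry that item instead of passing the result on.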