#What is your method for multi-account rotation training? How do I achieve it?

heavy island
#

I tested leaving the API keys blank in the .env file, with the following config.yaml:

model:
default: gemini-flash-lite-latest # Default model: the latest rolling version of Gemini 3.1 Flash Lite
provider: google # Provider: use the official Google driver
base_url: https://generativelanguage.googleapis.com/v1beta # API address: the v1beta endpoint, which supports the latest models
credential_pool: # Credential pool: a set of API keys that can be rotated
gemini: # Key pool for the gemini service
- provider: google # 1st credential source: provider is Google
api_key: GEMINI_API_KEY_1 # 1st API key
- provider: google # 2nd credential source
api_key: GEMINI_API_KEY_2 # 2nd API key
- provider: google # 3rd credential source
api_key: GEMINI_API_KEY_3 # 3rd API key
credential_pool_strategies: # Credential pool strategies: how to switch between the keys above
gemini: fill_first # Strategy: keep using the first key until a 429 rate-limit error is triggered, then switch to the next
providers: {} # Provider-specific configuration: currently empty, so global defaults apply
fallback_providers: # Fallbacks: the last resort when every key in the credential pool has failed
- provider: openrouter # Fallback 1: call via OpenRouter
model: z-ai/glm-4.5-air:free # Model 1: Zhipu GLM-4.5 Air, free version
- provider: openrouter # Fallback 2
model: openai/gpt-oss-120b:free # Model 2: OpenAI 120B open-source model, free version
- provider: openrouter # Fallback 3
model: openrouter/free # Model 3: a free model assigned at random by OpenRouter

However, it resulted in an error during execution. I then tried setting the API keys in the .env file as follows:

.env

GEMINI_API_KEY_1=your_api_key_1_here
GEMINI_API_KEY_2=your_api_key_2_here
GEMINI_API_KEY_3=your_api_key_3_here

But it still failed.


What is your method for multi-account rotation training? How do I achieve it?

vale crater
#

The problem is not “multi-account rotation doesn’t work.” The problem is that Hermes is not reading the pool from config.yaml the way you wrote it.

For Gemini, the pool Hermes actually rotates comes from the auth store, not from a handwritten nested credential_pool block in config.yaml.

Also, Gemini env auto-detection only looks for:
GOOGLE_API_KEY
GEMINI_API_KEY

It does not auto-load numbered vars like:
GEMINI_API_KEY_1
GEMINI_API_KEY_2
GEMINI_API_KEY_3
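
To make the distinction concrete, here is a minimal Python sketch of which names an exact-match auto-detection step would pick up. Only the two detected names come from the message above; the function names and the matching logic are assumptions for illustration and are not taken from Hermes's code.

```python
# Hypothetical sketch: which env var names an exact-match detector would see.
# Only the two names below are stated in the thread as auto-detected; the
# helper names and logic are illustrative assumptions, not Hermes code.
AUTO_DETECTED = ("GOOGLE_API_KEY", "GEMINI_API_KEY")


def detect_gemini_env_keys(env):
    """Return the auto-detected names that are present and non-empty in env."""
    return [name for name in AUTO_DETECTED if env.get(name)]


def ignored_numbered_keys(env):
    """Return numbered variants (GEMINI_API_KEY_1, ...) that would be skipped."""
    return sorted(
        name for name in env
        if name.startswith("GEMINI_API_KEY_") and name not in AUTO_DETECTED
    )


if __name__ == "__main__":
    sample = {"GEMINI_API_KEY_1": "k1", "GEMINI_API_KEY_2": "k2", "GOOGLE_API_KEY": "k0"}
    print("detected:", detect_gemini_env_keys(sample))
    print("ignored:", ignored_numbered_keys(sample))
```

Under these assumptions, numbered variables never reach the pool on their own, which is why each key has to be added explicitly.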

So the working setup is:

Set the model/provider normally:
hermes config set model.provider gemini
hermes config set model.default gemini-3.1-flash-lite-preview

Set the pool strategy:
hermes config set credential_pool_strategies.gemini fill_first

Then add each Gemini key into the real Hermes credential pool:
hermes auth add gemini
hermes auth add gemini
hermes auth add gemini

Paste one API key each time when prompted.

Then verify Hermes sees the pool:
hermes auth list gemini

If that is set up correctly, you should see multiple gemini credentials listed there, and that is what Hermes rotates.

So the practical fix is: do not try to define the Gemini pool manually inside config.yaml. Use hermes auth add gemini to create the pool entries.

Your pasted sample also looks malformed in a couple of places, especially the pool block and the fallback block, so if it still fails after doing the steps above, paste these three things:

hermes auth list gemini

the model: and credential_pool_strategies: sections from ~/.hermes/config.yaml

the exact error from ~/.hermes/logs/agent.log

With that, we can tell you whether it’s just config shape, missing pool entries, or a separate runtime issue.

heavy island
#

So if what I'm adding is Google AI Studio, do I still run hermes auth add gemini?

heavy island
#

So I want to add multiple Google AI Studio API keys to Hermes and have it move on to the next API key once the first one runs out of quota or fails. How do I set that up?

vale crater
#

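Yes. If you are adding a Google AI Studio API key, the corresponding provider in Hermes is gemini, so the command is still:

hermes auth add gemini --label aistudio-1
hermes auth add gemini --label aistudio-2
hermes auth add gemini --label aistudio-3

Paste a different Google AI Studio API key each time.

Then set the rotation strategy:

hermes config set credential_pool_strategies.gemini fill_first

Then confirm that Hermes really sees multiple keys:

hermes auth list gemini

If you want Google AI Studio as the primary model, run:

hermes model

and pick a model under Google AI Studio / Gemini. Don't hand-write an uncertain model name like gemini-flash-lite-latest; it is better to choose from hermes model so the model name cannot mismatch.

How fill_first behaves: it keeps using the first available key. Once that key hits an error such as rate limiting, quota exhaustion, or an authentication failure, Hermes marks it as temporarily unavailable and switches to the next key.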
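
The fill_first behavior described above can be pictured with a small Python sketch. This is a conceptual model only, not Hermes's implementation: the class name, method names, and the idea of tracking exhausted labels in a set are all assumptions, and the labels simply mirror the ones used in the commands above.

```python
# Conceptual model of a fill_first credential pool (NOT Hermes's implementation).
# Class and method names are hypothetical; labels mirror the thread.
class FillFirstPool:
    def __init__(self, labels):
        self.labels = list(labels)   # order matters: earlier keys are preferred
        self.exhausted = set()       # labels currently considered unavailable

    def current(self):
        """Return the first label not marked exhausted, or None if all are."""
        for label in self.labels:
            if label not in self.exhausted:
                return label
        return None

    def mark_exhausted(self, label):
        """Mark a key as temporarily unavailable so the next one is picked."""
        self.exhausted.add(label)


if __name__ == "__main__":
    pool = FillFirstPool(["aistudio-1", "aistudio-2", "aistudio-3"])
    print(pool.current())              # the first key is used while it works
    pool.mark_exhausted("aistudio-1")  # e.g. after a 429 on that key
    print(pool.current())              # the next available key takes over
```

The point of the sketch is that nothing moves until a key is marked exhausted; a healthy first key is used indefinitely.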

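Two things to note:

Numbered environment variables such as GEMINI_API_KEY_1, GEMINI_API_KEY_2, and GEMINI_API_KEY_3 are not read automatically by Hermes.

google-gemini-cli is a different login method (OAuth / Gemini CLI), not a Google AI Studio API key. For multi-API-key rotation like yours, use gemini.

So your target configuration should be:

Multiple Google AI Studio keys: add them with hermes auth add gemini
Rotation strategy: credential_pool_strategies.gemini fill_first
Model selection: pick a Gemini model with hermes model

Don't hand-write the credential_pool or fallback_providers YAML for now. Get the Gemini multi-key pool working first, then deal with fallback.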

heavy island
#

GNU nano 8.4 config.yaml
model:
default: gemini-flash-lite-latest
provider: gemini
base_url: https://generativelanguage.googleapis.com/v1beta
providers: {}
fallback_providers:
- provider: openrouter
model: z-ai/glm-4.5-air:free
- provider: openrouter
model: openai/gpt-oss-120b:free
- provider: openrouter
model: openrouter/free
toolsets:
- hermes-cli
agent:
max_turns: 150
gateway_timeout: 1800
restart_drain_timeout: 60
service_tier: ''
tool_use_enforcement: auto
gateway_timeout_warning: 900
gateway_notify_interval: 600
verbose: false

logs:

heavy island
#

First I set hermes model to the gemini-flash-lite-latest model under Google AI Studio.

Then I ran:

hermes auth add gemini
hermes auth add gemini

Then I ran hermes auth list gemini.

Output:

gemini (3 credentials):
#1 GOOGLE_API_KEY api_key env:GOOGLE_API_KEY ←
#2 aistudio-2 api_key manual
#3 aistudio-3 api_key manual

After that I ran:

Set the pool strategy:
hermes config set credential_pool_strategies.gemini fill_first

How can I verify that it is actually following my plan, that is, using up one key and then moving on to the next?

#

There are two ways to add the model, Gemini and Google AI Studio: one is the web-based auth-token login mode, the other is the API-key mode. Can both be added with hermes auth add gemini?

vale crater
#

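Your output is correct:

gemini (3 credentials):
#1 GOOGLE_API_KEY api_key env:GOOGLE_API_KEY ←
#2 aistudio-2 api_key manual
#3 aistudio-3 api_key manual

The ← here means Hermes will currently pick GOOGLE_API_KEY first.

You have set:

hermes config set credential_pool_strategies.gemini fill_first

From now on, fill_first behaves like this: it keeps using the first available key until Hermes marks that key as exhausted because of an error such as a 429 rate limit, quota exhaustion, or an authentication failure, and only then switches to the next available key.

Update Hermes first, to make sure you are testing current main:

hermes update

If you are testing through the gateway / Telegram / Discord, restart the gateway after updating and setting the strategy:

hermes gateway restart

To verify, look mainly at agent.log, not at the normal reply text. When rotation really happens, the log should contain something like:

credential pool: marking GOOGLE_API_KEY exhausted (status=429), rotating
credential pool: rotated to aistudio-2

Then run:

hermes auth list gemini

At that point the ← should have moved to aistudio-2, or the first key should show an exhausted / cooldown state.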
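
If reading the log by eye is tedious, a short Python script can pull out the rotation lines. This is a sketch under assumptions: the regular expressions are modeled on the two example lines quoted above, which were given only as "something like" the real output, so the actual agent.log format may differ. The log path is the one mentioned earlier in this thread; the function name is hypothetical.

```python
import os
import re

# Patterns modeled on the example lines quoted in the thread. The real
# agent.log format may differ, so treat both regexes as assumptions.
MARK_RE = re.compile(r"credential pool: marking (\S+) exhausted(?: \(status=(\d+)\))?")
ROTATE_RE = re.compile(r"credential pool: rotated to (\S+)")


def find_rotation_events(lines):
    """Return ("exhausted", label, status) and ("rotated", label, None) tuples."""
    events = []
    for line in lines:
        match = MARK_RE.search(line)
        if match:
            events.append(("exhausted", match.group(1), match.group(2)))
            continue
        match = ROTATE_RE.search(line)
        if match:
            events.append(("rotated", match.group(1), None))
    return events


if __name__ == "__main__":
    path = os.path.expanduser("~/.hermes/logs/agent.log")
    if os.path.exists(path):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for event in find_rotation_events(handle):
                print(event)
    else:
        print("log file not found:", path)
```

An empty result only means no line matched these patterns; it does not prove that no rotation happened.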

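Also, in your earlier logs the requests were being triggered by cron jobs. When testing multi-key rotation it is best to pause cron first; otherwise background jobs keep consuming the keys and it becomes hard to tell which request triggered the rotation.

List cron jobs:

hermes cron list

Pause a job:

hermes cron pause <job_id>

I don't recommend deliberately burning through a key just to test. As long as agent.log shows marking ... exhausted and rotated to ... the next time a real 429 occurs, rotation is working as you planned.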

heavy island
#

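When should I run hermes update? I will run it, but I'm wondering: I'm only setting up multiple API keys, so why do I need to run hermes update?

#

I tested through Telegram and got: ⚠️ Response truncated due to output length limit. Clearly it didn't work.

#

Is this a bug?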

┊ 💻 preparing terminal…
┊ 💻 $ echo "Current Model: $MODEL_NAME"

Note: As an AI agent, I don't have a direct tool to query the live API billing/token quota dashboard

in the way a user dashboard would. I operate until I hit a rate/quota limit,

at which point I receive a specific error and use the established retry/wait logic.

I will check if there's any local config file I should be aware of.

ls -F ~/.hermes/config.yaml
0.6s
⚠️ API call failed (attempt 1/3): GeminiAPIError [HTTP 400]
🔌 Provider: gemini Model: gemini-flash-lite-latest
🌐 Endpoint: https://generativelanguage.googleapis.com/v1beta
📝 Error: HTTP 400: Gemini HTTP 400 (INVALID_ARGUMENT): Function call is missing a thought_signature in functionCall parts. This is required for tools to work correctly, and missing thought_signature may lead to degraded model performance. Additional data, function call default_api:terminal , position 2. Please refer to https://ai.google.dev/gemini-api/docs/thought-signatures for more details.
⚠️ Non-retryable error (HTTP 400) — trying fallback...
🔄 Primary model failed — switching to fallback: z-ai/glm-4.5-air:free via openrouter
⚠️ API call failed (attempt 1/3): RateLimitError [HTTP 429]
🔌 Provider: openrouter Model: z-ai/glm-4.5-air:free
🌐 Endpoint: https://openrouter.ai/api/v1/
📝 Error: HTTP 429: Provider returned error
📋 Details: {'message': 'Provider returned error', 'code': 429, 'metadata': {'raw': 'z-ai/glm-4.5-air:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations', 'provider_name': 'Z.AI', 'is_byok': False}}
⏱️ Rate limited. Waiting 2.5s (attempt 2/3)...

vale crater
#

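For Google AI Studio's API-key mode, yes: the corresponding provider is gemini, so the command is:

hermes auth add gemini

But the web OAuth / browser login mode is not the same provider. That one is:

hermes auth add google-gemini-cli

So treat them separately:

Google AI Studio API-key mode:
hermes auth add gemini

Google Gemini OAuth / browser login mode:
hermes auth add google-gemini-cli

Don't mix the two as if they were one credential pool. A credential pool rotates credentials within a single provider. Your current three-key rotation is for the gemini provider, that is, Google AI Studio API keys.

Response truncated due to output length limit does not mean key rotation failed. That is a separate problem: the model produced an incomplete tool call, so Hermes refused to execute it.

The error you pasted afterwards is also a separate problem, not a key rotation problem:

Gemini HTTP 400 (INVALID_ARGUMENT): Function call is missing a thought_signature

This is a Gemini tool-calling protocol error. It is not a case of "the first key is used up, so switch to the next key". An HTTP 400 invalid request is not treated as quota exhaustion, so it does not trigger key rotation.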
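
The distinction between request errors and credential errors can be sketched as a simple decision function. The thread only says that "429 / quota / auth" failures lead to rotation and that HTTP 400 does not; mapping "auth" to 401 and 403 is an assumption made here for illustration, and the function name is hypothetical rather than anything from Hermes.

```python
# Sketch of the rotate-or-not decision described above. The thread names
# "429 / quota / auth" as rotation triggers; treating 401 and 403 as the
# auth cases is an assumption, not confirmed Hermes behavior.
ROTATE_STATUSES = {401, 403, 429}


def should_rotate_key(status_code):
    """Return True if the failure looks tied to the credential, not the request."""
    return status_code in ROTATE_STATUSES


if __name__ == "__main__":
    for status in (400, 429):
        if should_rotate_key(status):
            print(status, "-> rotate to the next key")
        else:
            print(status, "-> keep the key; the request itself is the problem")
```

A malformed request fails the same way on every key, so rotating on a 400 would only burn through the pool without fixing anything.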

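When key rotation really happens, you should see 429 / quota / auth errors in the log, followed by something like:

credential pool: marking ... exhausted
credential pool: rotated to ...

For this thought_signature 400, updating Hermes does make sense, because current main contains fixes related to preserving Gemini tool calls / thought signatures. Update first:

hermes update

If you are testing through Telegram / the gateway, restart the gateway after updating:

hermes gateway restart

Then test again. If the same thought_signature 400 still appears after updating to current main, please send a fresh debug report:

hermes debug share

and paste these three outputs as well:

hermes -V
hermes auth list gemini
hermes config show

Also, your OpenRouter fallback is hitting 429 too, so the fallback can fail as well after Gemini fails. That is a separate issue from Gemini multi-key rotation.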

heavy island
vale crater
#

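This new log still doesn't contain the actual error line.

What it does show is that the update / restart happened and the gateway came back up normally. In the later log entries Hermes is already using:

gemini-3.1-flash-lite-preview

So that part already looks better.

The key line now is this one:

Fallback activated: gemini-3.1-flash-lite-preview → z-ai/glm-4.5-air:free (openrouter)

It shows that Gemini failed and Hermes then switched to the fallback. The problem is that the excerpt you pasted doesn't include the Gemini error immediately before the fallback, so we can't yet tell whether the remaining problem is a Gemini 400, a 429, a tool-call issue, or some other error.

Please paste the log from just before this timestamp:

2026-04-23 11:49:53

I need to see the first WARNING / ERROR before this line:

Fallback activated: gemini-3.1-flash-lite-preview → z-ai/glm-4.5-air:free (openrouter)

The simplest option: after updating and restarting, run debug share again and post the links:

hermes debug share

What your log confirms so far:

Hermes has been updated / restarted
Telegram is connected normally
The model changed from gemini-flash-lite-latest to gemini-3.1-flash-lite-preview
This log shows no key rotation
This log shows that a fallback happened
The actual reason Gemini failed is still missing

So the next step is not to paste more config screenshots, but to paste the actual error line right before the fallback.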

heavy island
#

Does hermes debug share automatically upload the bug / error to your support team?

vale crater
#

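No. hermes debug share does not automatically send bugs or logs to the support team. It only uploads the debug report and logs to a temporary public paste service and gives you links. You still need to post those links here for support to look at them.

I looked at your new links. The shape of your Gemini API-key pool configuration is now correct:

provider: gemini
model: gemini-3.1-flash-lite-preview
hermes auth list gemini shows 3 Gemini credentials
The ← is on GOOGLE_API_KEY, meaning Hermes will use that key first
The Telegram gateway is running
The OpenRouter fallback is configured as well

The logs also show that key rotation did work earlier. Around 2026-04-23 02:01, Hermes marked one Gemini key as exhausted with status=429 and switched to another key; later that other key was marked exhausted too. That is normal rotation behavior.

In the latest logs, after you switched to gemini-3.1-flash-lite-preview, I don't see the earlier thought_signature 400 error, and I don't see any new Gemini key rotation failure. All I see now is Gemini failing and then the fallback kicking in.

The bigger problem now is that your fallback models are all free OpenRouter models:

z-ai/glm-4.5-air:free
openai/gpt-oss-120b:free
openrouter/free

Their limits are very low. According to the OpenRouter documentation, if the account has purchased fewer than 10 credits, free models are limited to 50 requests per day. With at least 10 credits purchased, the daily limit for free models rises to 1000 requests, but the 20 requests/minute limit still applies, and the upstream provider can still rate-limit an individual free model.

OpenRouter rate limit documentation:
https://openrouter.ai/docs/api-reference/limits/

OpenRouter pricing / free model limit:
https://openrouter.ai/pricing

Your debug report also shows 8 active / 8 total cron jobs. Those cron jobs will burn through the Gemini quota quickly, and through the free OpenRouter fallback as well. If free OpenRouter models only get 50 requests per day, a single fairly heavy cron job could use up most or all of the fallback capacity in one run.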
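
To see how quickly a daily budget like that disappears, here is a small Python calculation. The 50 and 1000 requests/day figures and the 10-credit threshold are the ones quoted above and should be checked against OpenRouter's current documentation; the function names and the 40-requests-per-run figure for a "heavy" cron job are invented for illustration.

```python
# Budget arithmetic using the figures quoted in the thread (50/day below
# 10 purchased credits, 1000/day at 10 or more). Verify current values in
# OpenRouter's docs; function names here are hypothetical.
def free_daily_limit(purchased_credits):
    """Daily free-model request limit for an account, per the quoted figures."""
    return 1000 if purchased_credits >= 10 else 50


def runs_per_day(purchased_credits, requests_per_run):
    """How many whole job runs fit into the daily free-model budget."""
    if requests_per_run <= 0:
        raise ValueError("requests_per_run must be positive")
    return free_daily_limit(purchased_credits) // requests_per_run


if __name__ == "__main__":
    # 40 requests per run is a made-up figure for a heavy agentic cron job.
    print("runs/day below 10 credits:", runs_per_day(0, 40))
    print("runs/day with 10+ credits:", runs_per_day(10, 40))
```

With several scheduled jobs sharing the same fallback, the lower tier can be gone after a single run.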

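While testing, pause cron first:

hermes cron list

Then pause the jobs you don't want running during the test:

hermes cron pause <job_id>

After pausing, test once more from Telegram. If it fails again, run hermes debug share right after the failure and post the new links along with the approximate time of the failure.

What really helps is the specific error just before the fallback, or rotation log lines like these:

credential pool: marking ... exhausted
credential pool: rotated to ...

At this point your Gemini pool configuration looks correct. The logs look more like both Gemini and the free OpenRouter fallback hitting quota / rate limits than like the multi-key pool itself being broken.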

heavy island
#

My cron jobs aren't scheduled to run until much later, so they shouldn't affect my testing, right?

heavy island
#

I configured multiple Google AI Studio API keys, three in total, and two of them have barely been used, so quota shouldn't really be an issue, should it?

vale crater
#

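Yes, if the next cron run is still some way off, it shouldn't affect the manual test you're doing now. I brought up cron because your earlier logs did show cron jobs starting around 02:00 and triggering 429s. So it may not affect your test at this moment, but it will still consume quota once the scheduled jobs actually run.

About the 3 Google AI Studio API keys: if the other 2 keys really do have independent quota available, they should be used once the first key hits a real 429 / quota / auth failure.

But there is an important detail here: multiple API keys do not necessarily mean three fully independent quota pools. If the keys belong to the same Google project / the same quota bucket, they may share quota. So "this key has barely been used" doesn't guarantee it still has quota of its own.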
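
A quick way to reason about this is to group keys by the project they belong to and count the groups. The Python sketch below does only that bookkeeping: the project IDs are made up, the function name is hypothetical, and the premise that one project corresponds to one quota bucket is the assumption stated in the message above, not something verified here.

```python
# Bookkeeping sketch: group key labels by Google project. Project IDs are
# invented; the "one project = one shared quota bucket" premise is an
# assumption taken from the message above.
def independent_buckets(key_to_project):
    """Group key labels by project and return {project: [labels]}."""
    buckets = {}
    for label, project in key_to_project.items():
        buckets.setdefault(project, []).append(label)
    return buckets


if __name__ == "__main__":
    keys = {
        "GOOGLE_API_KEY": "project-a",
        "aistudio-2": "project-a",
        "aistudio-3": "project-b",
    }
    buckets = independent_buckets(keys)
    print(len(buckets), "quota bucket(s) for", len(keys), "keys")
    for project, labels in buckets.items():
        print(project, "->", labels)
```

If all three keys land in one group, rotating between them would not be expected to add any capacity.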

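Your current strategy is:

credential_pool_strategies.gemini = fill_first

So Hermes keeps using GOOGLE_API_KEY first, because that's where the arrow is:

#1 GOOGLE_API_KEY ... ←

It won't proactively switch just because aistudio-2 exists. It only moves to the next key when the current one hits an error Hermes treats as credential exhaustion, such as a 429 / quota / auth failure.

When rotation really happens, the log will show:

credential pool: marking ... exhausted
credential pool: rotated to ...

None of these mean key rotation failed:

Response truncated due to output length limit
HTTP 400 thought_signature
Fallback activated ...

HTTP 400 thought_signature is a Gemini tool-calling protocol error, not a quota error, so Hermes shouldn't rotate API keys because of it. Switching to another API key won't fix this kind of request error either.

For a clean test, start by sending a simple message from Telegram that needs no tools, for example:

Test: reply with just one word, "OK", and do not call any tools

If that succeeds, the basic call path for the Gemini model and key is working.

If it fails again or goes to fallback, run this right after the failure:

hermes debug share

Then post the new links and the approximate time of the failure. What we need to see is the actual error before this line:

Fallback activated: gemini-3.1-flash-lite-preview → ...

Also, your OpenRouter fallback uses free models, whose limits are very low: if the account has purchased fewer than 10 credits, free models get only 50 requests per day; with at least 10 credits purchased, the daily limit rises to 1000 requests. So the OpenRouter fallback can easily get rate-limited quickly too.

OpenRouter limits:
https://openrouter.ai/docs/api-reference/limits/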

heavy island
#

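Thank you for your patient guidance.

I have one more question: I'd like to use Hermes agent for programming, but I have zero programming background, so I can't code myself.

My goal: use Hermes agent to build a multi-agent programming team. I state a requirement, and the team understands it and breaks it down, writes the code, does the front-end design, verifies and tests the code with multiple models, reviews the code for security, does operations testing, and writes the README, so that the software is 100% runnable. The aim is to avoid code that won't run because of a single agent's hallucinations and overconfidence.

Roles in the multi-agent team: product manager, software architect, full-stack engineer, front-end designer, code review engineer, test engineer, operations engineer, README documentation engineer.

I don't know how to build such a system or how to configure the models. Could you give me a plan for it? Thank you very much. To me this is a kind of technological equality for ordinary people.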

vale crater
#

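It can be done, but keep expectations realistic.

Hermes is better suited to a pipeline of "one orchestrating agent plus a small number of sub-agents" than to a permanently online software team modeled on a company org chart that automatically guarantees 100% runnable software. Multiple agents really can reduce a single agent's hallucinations and overconfidence, but they can't eliminate those problems, so think of it as raising the success rate and cutting down on basic mistakes, not as a replacement for acceptance checks.

If you want a multi-agent programming team, the most practical approach is not to start with product manager, architect, front end, full stack, security, testing, operations, and README roles all at once, but to narrow it down to a four-stage flow first: an orchestrator that breaks down requirements and orders the work, an implementer that writes the code, a reviewer that checks for obvious bugs and security problems, and a test/docs stage that runs the tests and fills in the README. That alone is noticeably more stable than a single agent, without blowing up cost, context noise, and agents working against each other.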
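
The four-stage flow can be sketched in plain Python to show how the stages hand work to each other. This is a conceptual illustration only: each function stands in for an agent, none of the names are Hermes APIs, and the "review" here is a deliberately trivial placeholder check rather than a real code review.

```python
# Conceptual sketch of the four-stage flow. Each function is a stand-in for
# an agent; nothing here is a Hermes API, and all names are hypothetical.
def orchestrate(requirement):
    """Split a requirement into ordered tasks (one per non-empty line)."""
    return [line.strip() for line in requirement.splitlines() if line.strip()]


def implement(task):
    """Stand-in for the implementer: returns a placeholder artifact."""
    return {"task": task, "code": f"# TODO: code for {task}"}


def review(artifact):
    """Stand-in for the reviewer: flags artifacts that are still placeholders."""
    artifact["review_ok"] = "TODO" not in artifact["code"]
    return artifact


def verify_and_document(artifact):
    """Stand-in for the test/docs stage: records a README line for the task."""
    artifact["readme"] = f"- {artifact['task']}"
    return artifact


def run_pipeline(requirement):
    """Run every task through implement -> review -> verify/document in order."""
    return [
        verify_and_document(review(implement(task)))
        for task in orchestrate(requirement)
    ]


if __name__ == "__main__":
    for result in run_pipeline("login page\nexport to CSV"):
        print(result["task"], "| review passed:", result["review_ok"])
```

In this toy run every artifact fails review because the implementer only emits placeholders, which mirrors the point above: the later stages exist to catch what the earlier one got wrong, not to guarantee a finished product.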

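On models, don't assume "more models means safer" either. Usually it is more reliable to have one stronger main model handle orchestration and core implementation, and let sub-agents do narrower work such as review, testing, and documentation, than to have a pile of weak models vote against each other. If there really is a UI, adding one design-oriented agent for the front end is enough; a role like operations doesn't belong in every development run by default.

If you're starting from zero, Hermes can move you a long way past "I can't write code", but it can't promise that a single sentence of requirements will reliably produce software that is fully production-ready, completely bug-free, and needs no judgment from you. A more honest way to put it: it can take you from "can't do this at all" to "can get started, can keep iterating, and can spot problems faster". That is already very valuable, and it really does lower the technical barrier.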

heavy island
#

Thank you very much for the clear, easy-to-understand answer.

vale crater
#

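You're welcome, and thank you for filling in the details so thoroughly along the way; that really helps in working through the problem step by step.

I've marked this thread as resolved for now. If you have new questions about Gemini multi-key rotation, fallback, or building a programming workflow with Hermes, just open a new thread and we'll pick it up from there.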