#Setting up local Gemma4 26B A4B

1 messages · Page 1 of 1 (latest)

broken basin
#

what should i change in the openclaw.json to make it worl perfectly?
"models": {
"mode": "merge",
"providers": {
"local-gemma": {
"baseUrl": "http://127.0.0.1:8082/v1",
"apiKey": "not-needed",
"api": "openai-completions",
"models": [
{
"id": "Gemma4-26B-A4B",
"name": "Google Gemma 4 26B A4B",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 262144,
"maxTokens": 8192
}
]
},

gusty dewBOT
#

Not “perfectly” 😅, but for the best chance with a local Gemma OpenAI-compatible backend, I’d change/add these:

{
  agents: {
    defaults: {
      model: { primary: "local-gemma/Gemma4-26B-A4B" },
      models: {
        "local-gemma/Gemma4-26B-A4B": { alias: "Gemma" },
      },
    },
  },

  models: {
    mode: "merge",
    providers: {
#
      "local-gemma": {
        baseUrl: "http://127.0.0.1:8082/v1",
        apiKey: "gemma-local",
        api: "openai-completions",
        models: [
          {
            id: "Gemma4-26B-A4B",
            name: "Google Gemma 4 26B A4B",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 262144,
            maxTokens: 8192,
            compat: {
              requiresStringContent: true,
#
              supportsTools: false
            }
          }
        ]
      }
    }
  }
}

Most important bits:

  • agents.defaults.model.primary so OpenClaw actually uses it
  • compat.requiresStringContent: true for stricter local /v1 backends
  • compat.supportsTools: false because Gemma/local OpenAI-compatible servers often break on full agent tool schemas
broken basin
#

it should support not only text but even other formats, like image and audio

gusty dewBOT
#

Yep, but there’s an important catch:

  • input in models.providers...models[] only supports "text" and "image"
  • "audio" is not valid there
  • Audio support is configured separately under tools.media.audio

So for your local Gemma block, the multimodal version would look more like:

{
  agents: {
    defaults: {
      model: { primary: "local-gemma/Gemma4-26B-A4B" },
    },
  },
#
  models: {
    mode: "merge",
    providers: {
      "local-gemma": {
        baseUrl: "http://127.0.0.1:8082/v1",
        apiKey: "gemma-local",
        api: "openai-completions",
        models: [
          {
            id: "Gemma4-26B-A4B",
            name: "Google Gemma 4 26B A4B",
            reasoning: false,
            input: ["text", "image"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 262144,
#
            maxTokens: 8192,
            compat: {
              requiresStringContent: true,
              supportsTools: false
            }
          }
        ]
      }
    }
  },

  tools: {
    media: {
      image: {
        enabled: true
#
      },
      audio: {
        enabled: true
      }
    }
  }
}

Two important notes:

  1. Image

    • input: ["text", "image"] is correct only if your Gemma server on :8082 actually exposes vision
    • if the backend is text-only, config alone won’t make images work
  2. Audio

#
  • audio is not handled by input
    • use tools.media.audio
    • if you want reliable audio, you usually add a transcription model or CLI fallback, for example OpenAI/Groq/Deepgram/Whisper

So the short answer is:

  • change input to ["text", "image"] for image support
  • add tools.media.audio.enabled: true for audio processing
  • do not add "audio" to the model input array

If you want, paste your full openclaw.json plus:

openclaw --version
openclaw models status