We need to restrict an endpoint to specific GPU models within a VRAM tier when activating workers on that endpoint via the API. Specifically, we want a 96GB VRAM endpoint that includes "NVIDIA RTX PRO 6000 Blackwell Server Edition" and "NVIDIA RTX PRO 6000 Blackwell Workstation Edition" but excludes "NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition".
We have tried four API approaches and none of them worked. Are we missing something?
Attempt 1 — REST POST gpuTypeIds (create new endpoint):
POST https://rest.runpod.io/v1/endpoints
{
  "name": "...",
  "templateId": "...",
  "gpuTypeIds": ["NVIDIA RTX PRO 6000 Blackwell Server Edition", "NVIDIA RTX PRO 6000 Blackwell Workstation Edition"],
  "workersMin": 0,
  "workersMax": 2,
  "idleTimeout": 5,
  "scalerType": "QUEUE_DELAY",
  "scalerValue": 4
}
Result: HTTP 500, "gpuId(s) is required for a gpu endpoint". The REST POST appears to wrap a GraphQL saveEndpoint mutation internally, and that mutation requires a gpuIds tier preset field that the REST schema does not expose.
Attempt 2 — REST POST with gpuIds added:
Same body as Attempt 1, with "gpuIds": "GPU_96GB" added.
Result: HTTP 400, "Extra input keys provided: 'gpuIds'". The REST schema explicitly rejects this field, even though the backend requires it.
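For anyone who wants to reproduce Attempts 1 and 2, here is a minimal Python sketch of the two requests, using only the standard library. The URL and body fields are the ones shown above. The Bearer authorization header and the helper names (build_create_body, post_endpoint) are assumptions for illustration, not something confirmed by the API docs.

```python
import json
import urllib.error
import urllib.request

REST_BASE = "https://rest.runpod.io/v1"

# The two GPU models we want to allow (the Max-Q variant is left out).
ALLOWED_GPUS = [
    "NVIDIA RTX PRO 6000 Blackwell Server Edition",
    "NVIDIA RTX PRO 6000 Blackwell Workstation Edition",
]


def build_create_body(name, template_id, gpu_type_ids, gpu_ids=None):
    """Build the Attempt 1 body; pass gpu_ids to get the Attempt 2 body."""
    body = {
        "name": name,
        "templateId": template_id,
        "gpuTypeIds": list(gpu_type_ids),
        "workersMin": 0,
        "workersMax": 2,
        "idleTimeout": 5,
        "scalerType": "QUEUE_DELAY",
        "scalerValue": 4,
    }
    if gpu_ids is not None:
        # Attempt 2: add the tier preset field the backend asks for.
        body["gpuIds"] = gpu_ids
    return body


def post_endpoint(body, api_key):
    """POST the body and return (status, text), also for 4xx/5xx responses."""
    request = urllib.request.Request(
        f"{REST_BASE}/endpoints",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; keep the status and the error body.
        return error.code, error.read().decode("utf-8")
```

Calling post_endpoint(build_create_body("...", "...", ALLOWED_GPUS), api_key) corresponds to Attempt 1, and adding gpu_ids="GPU_96GB" to build_create_body corresponds to Attempt 2.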
Attempt 3 — REST PATCH gpuTypeIds (update existing endpoint):
PATCH https://rest.runpod.io/v1/endpoints/{id}
{ "gpuTypeIds": ["NVIDIA RTX PRO 6000 Blackwell Server Edition", "NVIDIA RTX PRO 6000 Blackwell Workstation Edition"] }
Result: HTTP 200, success response. However, the endpoint's GPU configuration in the dashboard is unchanged — the VRAM tier cluster still shows all three GPU variants and the individual GPU types filter remains empty. The field is accepted but has no effect on actual GPU allocation.
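To check the PATCH result through the API rather than the dashboard, something like the following sketch could be used. It assumes that GET on the same /endpoints/{id} path returns the endpoint as JSON with a gpuTypeIds field and that Bearer auth is accepted; both are assumptions, and the helper names are hypothetical.

```python
import json
import urllib.request

REST_BASE = "https://rest.runpod.io/v1"


def gpu_filter_diff(requested, reported):
    """Compare requested GPU names with what the API reports.

    Returns (missing, unexpected): names we asked for that are absent,
    and names that are present although we did not ask for them.
    """
    requested_set = set(requested)
    reported_set = set(reported or [])  # treat a null/absent field as empty
    missing = sorted(requested_set - reported_set)
    unexpected = sorted(reported_set - requested_set)
    return missing, unexpected


def fetch_endpoint(endpoint_id, api_key):
    """GET the endpoint and return the parsed JSON (assumed response shape)."""
    request = urllib.request.Request(
        f"{REST_BASE}/endpoints/{endpoint_id}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

If the PATCH had taken effect, gpu_filter_diff(requested, endpoint.get("gpuTypeIds")) would return two empty lists; an empty or missing field shows up as both requested models in the first list.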
Attempt 4 — GraphQL saveEndpoint with individual GPU model names:
We passed individual model name strings in gpuIds instead of tier preset IDs. Result: the model names are silently ignored and the endpoint reverts to the tier preset.
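For reference, Attempt 4 can be sketched roughly as below. This only builds the JSON payload for the GraphQL request; the selection set ({ id gpuIds }), the inline input object, and the helper names are assumptions for illustration, and the exact way the model names were combined into gpuIds may differ from what is shown in the usage note.

```python
import json


def render_input(fields):
    """Render a flat dict of str/int values as a GraphQL input object literal."""
    # json.dumps yields double-quoted, escaped strings and bare integers,
    # which are also valid GraphQL literals for these simple value types.
    parts = [f"{key}: {json.dumps(value)}" for key, value in fields.items()]
    return "{ " + ", ".join(parts) + " }"


def build_save_endpoint_payload(fields):
    """Wrap the fields in a saveEndpoint mutation and return the POST body."""
    query = (
        "mutation { saveEndpoint(input: "
        + render_input(fields)
        + ") { id gpuIds } }"  # assumed selection set
    )
    return {"query": query}


def gpu_ids_reverted(sent, returned):
    """True when the gpuIds value echoed back differs from the one sent."""
    return sent != returned
```

For example, build_save_endpoint_payload({"id": "...", "gpuIds": "NVIDIA RTX PRO 6000 Blackwell Server Edition"}) produces the kind of request we sent, and gpu_ids_reverted can be applied to the gpuIds value in the response to detect the silent fallback to the tier preset.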