Hey folks — I put together a full walkthrough on how to get Mistral-7B exported to ONNX and running in Unity Sentis 6.2.
For a long time the consensus was “Mistral can’t be exported” because of:
PyTorch 2.x fused scaled_dot_product_attention
Hugging Face’s in-place mask ops (__ior__, i.e. the |= operator)
Grouped Query Attention shape mismatches
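To make the GQA point concrete, here is a small NumPy sketch of the shape problem and the usual fix of repeating each key/value head. It is an illustration only, not code from the article: the helper name repeat_kv and the tiny seq/head_dim sizes are my assumptions, while 32 query heads and 8 KV heads match Mistral-7B's published config.

```python
import numpy as np

def repeat_kv(x: np.ndarray, n_rep: int) -> np.ndarray:
    """Expand (batch, n_kv_heads, seq, head_dim) to
    (batch, n_kv_heads * n_rep, seq, head_dim) by repeating each KV head."""
    return np.repeat(x, n_rep, axis=1)

batch, seq, head_dim = 1, 4, 8      # toy sizes, chosen for illustration
n_heads, n_kv_heads = 32, 8         # Mistral-7B: 32 query heads, 8 KV heads

rng = np.random.default_rng(0)
q = rng.standard_normal((batch, n_heads, seq, head_dim)).astype(np.float32)
k = rng.standard_normal((batch, n_kv_heads, seq, head_dim)).astype(np.float32)

# Without expansion the head axes (32 vs 8) cannot broadcast.
try:
    q @ np.swapaxes(k, -1, -2)
    mismatch = False
except ValueError:
    mismatch = True

# Each KV head is shared by n_heads // n_kv_heads = 4 query heads.
k_full = repeat_kv(k, n_heads // n_kv_heads)
scores = q @ np.swapaxes(k_full, -1, -2)   # (batch, n_heads, seq, seq)
```

If an exporter traces the graph with the unexpanded K/V shapes, this is the kind of mismatch that tends to show up later as a shape error.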
This guide shows how to:
✅ Patch the attention layer (remove fused SDPA)
✅ Export clean logits → ONNX (opset 15)
✅ Post-process the exported graph to remove __ior__ if it sneaks in
✅ Validate + run inference with ONNX Runtime
✅ Drop straight into Unity Sentis 6.2 and get logits out
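For a feel of what "removing fused SDPA" means in the first step, here is a minimal sketch of the unfused math: plain matmul, an out-of-place causal mask, softmax, then another matmul. It is written in NumPy only to show the arithmetic. The actual patch in the article is PyTorch and may differ, and the function name explicit_attention is my own.

```python
import numpy as np

def explicit_attention(q, k, v, causal=True):
    """Compute softmax(q k^T / sqrt(d)) v step by step.

    q, k, v have shape (batch, heads, seq, head_dim); k and v are assumed
    to already have the same number of heads as q.
    """
    d = q.shape[-1]
    scores = (q @ np.swapaxes(k, -1, -2)) / np.sqrt(d)
    if causal:
        seq_q, seq_k = scores.shape[-2:]
        # True above the diagonal marks future positions to hide.
        future = np.triu(np.ones((seq_q, seq_k), dtype=bool), k=1)
        # Out-of-place masking: build a new array instead of using |= or
        # another in-place update on an existing mask.
        scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 5, 4))
k = rng.standard_normal((1, 2, 5, 4))
v = rng.standard_normal((1, 2, 5, 4))
out = explicit_attention(q, k, v)   # shape (1, 2, 5, 4)
```

Written this way, the computation breaks down into basic operations (matrix multiply, select, softmax) that map onto standard ONNX operators, instead of one fused call the exporter has to understand.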
Full article with copy-paste code:
👉 https://huggingface.co/blog/dimentox/exporting-mistral-7b-onnx-unity-sentis-62
If you’re working with large language models in Unity, this should save you a ton of trial and error.