I'm trying to determine the most accurate Llama-3.3-70B model in the Unsloth family (https://huggingface.co/collections/unsloth/llama-33-all-versions) that can run on an A100 80GB GPU. Here's what gets downloaded when I set each of the following models in my Python code:
1:
MODEL = "unsloth/Llama-3.3-70B-Instruct-bnb-4bit" # https://huggingface.co/unsloth/Llama-3.3-70B-Instruct-bnb-4bit
which downloads to ~/.cache/huggingface/hub as: models--unsloth--llama-3.3-70b-instruct-bnb-4bit
Seems correct.
2:
MODEL = "unsloth/Llama-3.3-70B-Instruct" # https://huggingface.co/unsloth/Llama-3.3-70B-Instruct
which downloads to ~/.cache/huggingface/hub as: models--unsloth--llama-3.3-70b-instruct-unsloth-bnb-4bit
Is the directory name misleading? Is this download actually fp16/bf16 and not 4-bit? Thanks for any clarification on this issue.
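In case it helps anyone reproduce this, here is a rough sketch of how I think the precision of a cached download could be checked, by reading the config.json inside each snapshot folder. The function names are my own (hypothetical), and I'm assuming that bitsandbytes checkpoints record a `quantization_config` entry with `load_in_4bit` in config.json, and that the hub cache uses the `snapshots/<revision>/` layout.

```python
import json
from pathlib import Path


def describe_precision(config: dict) -> str:
    """Summarize the weight format that a config.json declares."""
    quant = config.get("quantization_config")
    if quant:
        # Assumption: pre-quantized bitsandbytes repos set these keys.
        method = quant.get("quant_method", "unknown method")
        if quant.get("load_in_4bit"):
            return f"4-bit ({method})"
        if quant.get("load_in_8bit"):
            return f"8-bit ({method})"
        return f"quantized ({method})"
    # No quantization block: report the declared dtype, if any.
    return f"unquantized ({config.get('torch_dtype', 'dtype not stated')})"


def describe_cached_snapshots(model_dir: Path) -> dict:
    """Map each snapshot folder under one hub cache entry to its precision."""
    results = {}
    for config_path in sorted(model_dir.glob("snapshots/*/config.json")):
        with config_path.open() as f:
            results[config_path.parent.name] = describe_precision(json.load(f))
    return results
```

Calling `describe_cached_snapshots` on `~/.cache/huggingface/hub/models--unsloth--llama-3.3-70b-instruct-unsloth-bnb-4bit` should then show whether that folder really holds 4-bit weights or an unquantized checkpoint.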