#Llama-3.3-70B-Instruct Clarity

tight pulsar

I'm trying to determine the best-accuracy Llama-3.3-70B model in the Unsloth family (https://huggingface.co/collections/unsloth/llama-33-all-versions) for running on an A100 80GB GPU. Here's what is returned when I declare the following models in my Python code:

1:

MODEL = "unsloth/Llama-3.3-70B-Instruct-bnb-4bit" # https://huggingface.co/unsloth/Llama-3.3-70B-Instruct-bnb-4bit

which downloads to ~/.cache/huggingface/hub as: models--unsloth--llama-3.3-70b-instruct-bnb-4bit

Seems correct.

2:

MODEL = "unsloth/Llama-3.3-70B-Instruct" # https://huggingface.co/unsloth/Llama-3.3-70B-Instruct

which downloads to ~/.cache/huggingface/hub as: models--unsloth--llama-3.3-70b-instruct-unsloth-bnb-4bit

Is the directory name misleading, i.e. is this actually the fp16/bf16 model and not a 4-bit one? Thanks for any clarity on this.
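
One way to settle what actually landed in the cache is to read the snapshot's config.json. As far as I know, pre-quantized bitsandbytes checkpoints carry a quantization_config entry there, while full-precision ones only list a torch_dtype. A minimal sketch, assuming the standard hub layout (models--org--name/snapshots/<revision>/config.json); describe_config and describe_cached_model are hypothetical helper names, not part of any library:

```python
import json
from pathlib import Path


def describe_config(config):
    """Summarize the precision described by a parsed config.json dict."""
    quant = config.get("quantization_config")
    if quant is None:
        # No quantization block: weights are stored in the listed dtype.
        return f"unquantized, torch_dtype={config.get('torch_dtype')}"
    if quant.get("load_in_4bit"):
        width = "4-bit"
    elif quant.get("load_in_8bit"):
        width = "8-bit"
    else:
        width = "unknown width"
    return f"{quant.get('quant_method', 'unknown method')} {width}"


def describe_cached_model(folder, hub_dir="~/.cache/huggingface/hub"):
    """Describe every snapshot found under one cache folder."""
    snapshots = Path(hub_dir).expanduser() / folder / "snapshots"
    return {
        cfg.parent.name: describe_config(json.loads(cfg.read_text()))
        for cfg in sorted(snapshots.glob("*/config.json"))
    }


if __name__ == "__main__":
    # Folder name copied from the second case above.
    folder = "models--unsloth--llama-3.3-70b-instruct-unsloth-bnb-4bit"
    print(describe_cached_model(folder))
```

If the second folder reports a bitsandbytes 4-bit config, the directory name is accurate and the loader fetched a different repo than the one named in MODEL.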

Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions.

topaz rock

Did you set load_in_4bit=True? Maybe that's why.
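
To make the flag explicit, here is a rough sketch of loading through Unsloth's FastLanguageModel.from_pretrained with the 4-bit switch passed in by hand. My understanding (an assumption, not confirmed in this thread) is that with 4-bit loading enabled Unsloth may fetch a pre-quantized repo in place of the one named, which would explain the cache folder. The helper names, the max_seq_length value, and the round 70e9 parameter count are illustrative assumptions; the size estimate counts weights only and ignores quantization overhead, activations, and the KV cache.

```python
def estimate_weight_gib(n_params, bits_per_param):
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def load_model(model_name, four_bit):
    """Load a model with the 4-bit flag set explicitly (hypothetical helper)."""
    # Imported lazily because unsloth needs a CUDA-capable GPU.
    from unsloth import FastLanguageModel

    return FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=2048,    # assumed value, adjust to your use case
        load_in_4bit=four_bit,  # False asks for the unquantized weights
    )


if __name__ == "__main__":
    n_params = 70e9  # rough parameter count, an assumption
    print(f"16-bit weights: ~{estimate_weight_gib(n_params, 16):.0f} GiB")
    print(f" 4-bit weights: ~{estimate_weight_gib(n_params, 4):.0f} GiB")
    # model, tokenizer = load_model("unsloth/Llama-3.3-70B-Instruct", four_bit=True)
```

By this estimate the 16-bit weights alone come to roughly 130 GiB, more than a single A100 80GB holds, so a 4-bit variant is likely the practical choice on that card regardless of which repo name is declared.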