#Hey there,


steep spade

I'm running magic-llama-3.1 on a Threadripper 1920X and an RTX 3070 Ti with Ubuntu 24.04.1 LTS.
I have a few errors/questions:

  • With "magic run serve --huggingface-repo-id modularai/llama-3.1"
    I got this right at the beginning: "INFO: MainThread: root:

    Estimated memory consumption:
    Weights: 4693 MiB
    KVCache allocation: 128 MiB
    Total estimated: 4821 MiB used / unknown MiB free

    Current batch size: 1
    Current max sequence length: 512
    Max recommended batch size for current sequence length: unknown
    "

Why is the free memory in the estimate reported as unknown? The same goes for the recommended batch size.
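
For reference, the total in the log is just weights plus KV cache (4693 + 128 = 4821 MiB), so the only missing piece is the free-memory figure. Here is a minimal sketch of how that figure could be cross-checked by hand, assuming `nvidia-smi` is on the PATH. The helper names are my own and are not part of magic or MAX, and I don't know how the server itself queries memory.

```python
import shutil
import subprocess

WEIGHTS_MIB = 4693   # "Weights" value copied from the log
KV_CACHE_MIB = 128   # "KVCache allocation" value copied from the log


def estimated_total_mib(weights_mib, kv_cache_mib):
    """Total estimate as the log appears to compute it: weights + KV cache."""
    return weights_mib + kv_cache_mib


def free_gpu_memory_mib():
    """Free memory of the first GPU in MiB, or None if it cannot be queried.

    Hypothetical helper: it shells out to nvidia-smi and is unrelated to
    whatever the serving process does internally.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return None
    lines = out.strip().splitlines()
    try:
        return int(lines[0].strip()) if lines else None
    except ValueError:
        return None


total = estimated_total_mib(WEIGHTS_MIB, KV_CACHE_MIB)
free = free_gpu_memory_mib()
print(f"Total estimated: {total} MiB used / "
      f"{free if free is not None else 'unknown'} MiB free")
```

If this also prints "unknown", the driver tools are not reachable from that shell, which may be related to the GPU problem below.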

  • Why is only one of my NUMA nodes in use without the gpu flag?

  • The gpu flag isn't working on driver 550.120.
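
On the NUMA question, one way to narrow it down is to compare the CPUs each node owns with the CPUs a process is allowed to run on. This is a rough sketch that assumes a Linux sysfs layout under `/sys/devices/system/node`. The function names are mine, and it only reports CPU affinity, not where the work is actually scheduled.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-5,12-17' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def numa_nodes():
    """Map NUMA node id -> set of CPU ids from sysfs (empty dict if unavailable)."""
    nodes = {}
    for path in glob.glob("/sys/devices/system/node/node[0-9]*/cpulist"):
        match = re.search(r"node(\d+)/cpulist$", path)
        if not match:
            continue
        try:
            with open(path) as f:
                nodes[int(match.group(1))] = parse_cpulist(f.read())
        except OSError:
            continue
    return nodes


# 0 means "this process"; pass the PID of the serving process instead to
# inspect it (assumption: you look the PID up yourself, e.g. with ps).
allowed = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else set()

for node, cpus in sorted(numa_nodes().items()):
    print(f"node {node}: {len(cpus & allowed)} of {len(cpus)} CPUs allowed")
```

If every node shows all of its CPUs as allowed, the process is not pinned, and the one-node usage would have to come from how the workload itself is threaded.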