2024-01-15T21:32:29.206204204Z INFO 01-15 21:32:29 llm_engine.py:73] Initializing an LLM engine with config: model='mistralai/Mistral-7B-v0.1', tokenizer='mistralai/Mistral-7B-v0.1', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir='/runpod-volume/', load_format=auto, tensor_parallel_size=1, quantization=None, enforce_eager=False, seed=0)
2024-01-15T21:32:29.587369692Z engine.py :56 2024-01-15 21:32:29,586 Error initializing vLLM engine: CUDA driver initialization failed, you might not have a CUDA gpu.
2024-01-15T21:32:29.587459992Z Traceback (most recent call last):
2024-01-15T21:32:29.587477182Z File "/handler.py", line 7, in <module>
2024-01-15T21:32:29.587597751Z vllm_engine = VLLMEngine()
2024-01-15T21:32:29.587676721Z ^^^^^^^^^^^^
2024-01-15T21:32:29.587687568Z File "/engine.py", line 38, in __init__
2024-01-15T21:32:29.587846707Z self.llm = self._initialize_llm()
2024-01-15T21:32:29.587920123Z ^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.587927757Z File "/engine.py", line 57, in _initialize_llm
2024-01-15T21:32:29.588049626Z raise e
2024-01-15T21:32:29.588066343Z File "/engine.py", line 54, in _initialize_llm
2024-01-15T21:32:29.588169416Z return AsyncLLMEngine.from_engine_args(AsyncEngineArgs(**self.config))
2024-01-15T21:32:29.588340362Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.588362955Z File "/src/vllm/vllm/engine/async_llm_engine.py", line 496, in from_engine_args
2024-01-15T21:32:29.588594264Z engine = cls(parallel_config.worker_use_ray,
2024-01-15T21:32:29.588675837Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.588704403Z File "/src/vllm/vllm/engine/async_llm_engine.py", line 269, in __init__
2024-01-15T21:32:29.588857283Z self.engine = self._init_engine(*args, **kwargs)
2024-01-15T21:32:29.588974769Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.589017799Z File "/src/vllm/vllm/engine/async_llm_engine.py", line 314, in _init_engine
2024-01-15T21:32:29.589179321Z return engine_class(*args, **kwargs)
2024-01-15T21:32:29.589276287Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.589306141Z File "/src/vllm/vllm/engine/llm_engine.py", line 110, in __init__
2024-01-15T21:32:29.589436730Z self._init_workers(distributed_init_method)
2024-01-15T21:32:29.589445070Z File "/src/vllm/vllm/engine/llm_engine.py", line 142, in _init_workers
2024-01-15T21:32:29.589570340Z self._run_workers(
2024-01-15T21:32:29.589578206Z File "/src/vllm/vllm/engine/llm_engine.py", line 763, in _run_workers
2024-01-15T21:32:29.589964835Z self._run_workers_in_batch(workers, method, *args, **kwargs))
2024-01-15T21:32:29.589993004Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.589998521Z File "/src/vllm/vllm/engine/llm_engine.py", line 737, in _run_workers_in_batch
2024-01-15T21:32:29.590353319Z output = executor(*args, **kwargs)
2024-01-15T21:32:29.590387619Z ^^^^^^^^^^^^^^^^^^^^^^^^^
2024-01-15T21:32:29.590392816Z File "/src/vllm/vllm/worker/worker.py", line 67, in init_model
2024-01-15T21:32:29.590540725Z torch.cuda.set_device(self.device)
2024-01-15T21:32:29.590547462Z File "/usr/local/lib/python3.11/dist-packages/torch/cuda/__init__.py", line 404, in set_device
2024-01-15T21:32:29.590728911Z torch._C._cuda_setDevice(device)
2024-01-15T21:32:29.590792281Z File "/usr/local/lib/python3.11/dist-packages/torch/cuda/__init__.py", line 298, in _lazy_init
2024-01-15T21:32:29.590940554Z torch._C._cuda_init()
2024-01-15T21:32:29.590948904Z RuntimeError: CUDA driver initialization failed, you might not have a CUDA gpu.