working-group-ideas | GPU MODE | Page 2

Maxwell Equations Simulator 41 messages

Sep 14, 2024, 7:23 PM Multi-GPU

Add Support for <new type> KV Cache Quantization in TorchAO 3 messages

Sep 14, 2024, 5:07 PM Quantization / Sparsity

Optimize Quantization Settings to Fit a Given VRAM Budget 2 messages

Sep 14, 2024, 5:01 PM Quantization / Sparsity

Add an activation sparsity kernel to TorchAO 9 messages

Sep 14, 2024, 4:57 PM Quantization / Sparsity

Develop Fused Quantized GEMM/GEMV with LoRA 14 messages

Sep 14, 2024, 4:54 PM Quantization / Sparsity

Implement an LUT-based n-bit Quantization (nf format) Fused Matmul Kernel 3 messages

Sep 14, 2024, 4:45 PM Quantization / Sparsity

Develop an A16W3 (mixed fp16 x 3-bit) Fused Matmul Kernel 10 messages

Sep 14, 2024, 4:43 PM Quantization / Sparsity