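【dkernel: a library of high-performance custom CUDA kernels focused on optimizing sparse attention computation for large language models, improving compute efficiency and reducing resource consumption】 'This repo contains customized CUDA kernels written in OpenAI Triton. As of now, it contains the sparse attention kernel used in phi-3-small models.' GitHub: github.com/linxihui/dkernel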
爱生活爱珂珂
2024-12-17 13:40:46