Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. This project provides a C library rwkv.h and a ...