PCCX is an open NPU architecture for memory-bound Transformer inference on edge FPGAs, focused on GEMM/GEMV, KV-cache, W4A8 quantization, and custom ISA scheduling.
fpga neural-network parallel-computing transformer rtl isa deeplearning systemverilog computer-architecture quantization gemm inference-engine npu edge-ai hardware-accelerator gemv llm llm-inference llm-accelerator
Updated Apr 29, 2026 - SystemVerilog