Quantization for CNN Inference on FPGA

What Is Quantization?
In signal processing, quantization is the process of mapping a continuous range of values to a discrete (integer) set. In deep learning and hardware acceleration, it refers specifically to converting floating-point (FP32, FP16, or BF16) model weights and activations to lower-bit integers (INT8, INT4, etc.) to reduce memory footprint and computational cost. Quantization is the bridge that makes neural networks practical on resource-constrained hardware.

Why Quantization Is Essential for FPGA CNN Implementation
FPGAs have a fixed amount of logic, DSP slices, and BRAM. Floating-point arithmetic is expensive in all three dimensions: ...
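As a minimal sketch of the mapping described above, the following shows symmetric per-tensor INT8 quantization in NumPy (the scheme and helper names are illustrative assumptions, not the post's actual implementation):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map FP32 values to INT8 via a single per-tensor scale (symmetric scheme)."""
    scale = np.max(np.abs(x)) / 127.0          # largest magnitude maps to +/-127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original tensor."""
    return q.astype(np.float32) * scale

# Toy weight tensor: quantize, then check the round-trip error.
w = np.array([0.5, -1.2, 0.03, 1.2], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Storing `q` (1 byte per weight) instead of `w` (4 bytes) is where the memory saving comes from; on an FPGA the integer multiplies also map far more cheaply onto DSP slices than FP32 ones.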

November 24, 2025 · 3 min · EasyFPGA

LeNet-5 Implementation on FPGA: An Overview

What Is LeNet-5?
LeNet-5 is a convolutional neural network (CNN) proposed by Yann LeCun and colleagues in 1998 for handwritten digit recognition (the MNIST dataset). It is widely regarded as the historical model that established the foundational concepts of modern deep learning: convolution, pooling, and hierarchical feature extraction.

[Figure: LeNet-5 architecture, as shown in LeCun et al., 1998]

MNIST — 70,000 greyscale 28×28 images of handwritten digits ...

November 23, 2025 · 4 min · EasyFPGA