Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe

Researchers identify "Shrinkage Bias" as a fundamental limitation in E2M1 FP4 quantization formats used in next-generation hardware, proposing a new UFP4 recipe to mitigate systematic negative rounding errors during LLM pretraining.

The Challenge of FP4 Pretraining

The pursuit of efficiency in Large Language Model (LLM) pretraining has led to the adoption of 4-bit floating-point (FP4) formats. These formats promise substantial reductions in both memory overhead and computational cost, which allows larger models to be trained within a given memory budget and speeds up training on next-generation systems. Current hardware architectures, including the NVIDIA Blackwell and Rubin-class systems as well as the AMD MI350-series GPUs, primarily center their FP4 implementation on the E2M1 data element format, which spends its four bits on one sign bit, two exponent bits, and one mantissa bit.
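
As a concrete reference for what E2M1 can represent, the following Python sketch decodes every 4-bit code into its value. It assumes the encoding convention of the OCP Microscaling (MX) formats: an exponent bias of 1, subnormals at exponent zero, and no infinities or NaNs. The name decode_e2m1 is illustrative and does not come from the paper.

```python
def decode_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 code: 1 sign bit, 2 exponent bits, 1 mantissa bit."""
    if not 0 <= code <= 0xF:
        raise ValueError("E2M1 codes are 4 bits wide")
    sign = -1.0 if code & 0b1000 else 1.0
    exponent = (code >> 1) & 0b11
    mantissa = code & 0b1
    if exponent == 0:
        # Subnormal: no implicit leading one, so the only values are 0 and 0.5.
        magnitude = 0.5 * mantissa
    else:
        # Normal: implicit leading one, exponent bias of 1 (assumed, see above).
        magnitude = (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)
    return sign * magnitude


# The eight non-negative magnitudes; the negative half mirrors them exactly.
E2M1_MAGNITUDES = [decode_e2m1(code) for code in range(8)]
print(E2M1_MAGNITUDES)  # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
```

The step between neighbouring values is 0.5 up to 2, then 1 up to 4, then 2 up to 6, which is the non-uniform spacing the rest of the article refers to.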

The Geometric Origin of Shrinkage Bias

The study reveals a critical flaw in the reliance on non-uniform formats like E2M1, which the researchers term Shrinkage Bias: a systematic negative rounding error that stems from the geometric asymmetry of the representable bins within the FP4 format. The E2M1 values themselves are symmetric around the origin; the asymmetry lies in their spacing, which widens as magnitude grows, so the rounding bin around a representable value can extend farther toward larger magnitudes than toward smaller ones. As a result, the quantization process consistently biases values toward zero, effectively "shrinking" the weight and gradient magnitudes during the training process.
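
The bin asymmetry can be made concrete with a small Python sketch. It is a simplified illustration and not the paper's own analysis: it assumes round-to-nearest, treats inputs as uniformly distributed within each rounding bin, and looks only at interior grid points. The helper name interior_bin_bias is hypothetical.

```python
# Non-negative E2M1 magnitudes (the negative side is a mirror image).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def interior_bin_bias(grid):
    """Return (q, lo, hi, mean_error) for each interior grid point q.

    lo and hi are the round-to-nearest bin edges, i.e. the midpoints to the
    neighbouring grid points. mean_error is the average of (q - x) for x
    uniform on [lo, hi], which equals q minus the centre of the bin.
    """
    rows = []
    for i in range(1, len(grid) - 1):
        q = grid[i]
        lo = (grid[i - 1] + q) / 2
        hi = (q + grid[i + 1]) / 2
        mean_error = q - (lo + hi) / 2
        rows.append((q, lo, hi, mean_error))
    return rows


BIAS_TABLE = interior_bin_bias(E2M1_MAGNITUDES)
for q, lo, hi, err in BIAS_TABLE:
    print(f"q={q:<4} bin=[{lo}, {hi}]  mean(q - x)={err:+.3f}")
```

Under these assumptions, only the grid points 2 and 4, where the step size doubles, have a nonzero mean error (-0.125 and -0.25), and both are negative: values that round to them are on average larger than the value they are stored as. The endpoints 0 and 6 are left out because their behavior depends on scaling and clipping choices that this summary does not describe.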

Systemic Impact on LLM Convergence

This bias is not merely a marginal rounding error but a systemic issue that affects the stability and performance of LLM pretraining. The geometric asymmetry of the E2M1 format introduces a persistent drift that can degrade model convergence and final performance, potentially offsetting the computational gains provided by the reduced precision.

The UFP4 Proposal

To counteract these effects, the authors introduce a new "UFP4 recipe." While the full technical specifications of the recipe are detailed in the complete paper, the objective is to eliminate the inherent Shrinkage Bias found in standard E2M1 formats, thereby providing a more stable and accurate numerical foundation for 4-bit pretraining.

