WARP introduces a weight-space analysis method designed to recover the training-data portfolios of foundation models. Unlike membership inference, which focuses on individual samples, WARP aims to characterize the global training distribution and the domain mixture weights. The approach addresses the access asymmetry that arises when public model releases do not disclose their data recipes.