NVIDIA has released Nemotron-Labs-TwoTower, an open-weight diffusion language model built on a hybrid Mamba-2, attention, and MoE backbone based on Nemotron-3-Nano-30B-A3B. Its "two-tower" architecture separates a frozen autoregressive context tower, which processes clean tokens, from the denoising process, so that the two jobs no longer compete for the same weights. The design aims to meet both requirements of a diffusion language model: representing clean context and denoising noisy tokens.
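
The separation described above can be illustrated with a toy data flow. This is a minimal sketch under stated assumptions, not NVIDIA's implementation: the names `ContextTower`, `DenoiserTower`, `MASK_ID`, and `add_noise` are hypothetical, the count table stands in for the real Mamba-2, attention, and MoE layers, and the idea that the denoiser conditions on the context tower's final state is an assumption made for illustration. The only point it shows is that clean tokens pass through weights that never change, while training touches the denoiser alone.

```python
import random

MASK_ID = 0  # hypothetical id reserved for a noised ("masked") position


def add_noise(block, noise_rate, rng):
    """Replace each token id with MASK_ID with probability noise_rate."""
    return [MASK_ID if rng.random() < noise_rate else tok for tok in block]


class ContextTower:
    """Toy stand-in for the frozen context tower; it only sees clean tokens."""

    def __init__(self, vocab_size, num_states, rng):
        # Fixed per-token values, never updated after construction.
        self.weights = [rng.randrange(num_states) for _ in range(vocab_size)]
        self.num_states = num_states

    def encode(self, clean_tokens):
        # Causal running state: position i depends on tokens 0..i only.
        states, total = [], 0
        for tok in clean_tokens:
            total = (total + self.weights[tok]) % self.num_states
            states.append(total)
        return states


class DenoiserTower:
    """Toy stand-in for the trainable denoiser; it sees noisy tokens plus context."""

    def __init__(self, vocab_size, num_states):
        # Trainable table: counts[context_state][token id].
        self.counts = [[0] * vocab_size for _ in range(num_states)]

    def train_step(self, context_state, noisy_block, clean_block):
        # Learn only from positions that were noised.
        for noisy, clean in zip(noisy_block, clean_block):
            if noisy == MASK_ID:
                self.counts[context_state][clean] += 1

    def denoise(self, context_state, noisy_block):
        row = self.counts[context_state]
        # Most frequent real token for this context state (MASK_ID is skipped).
        guess = max(range(1, len(row)), key=row.__getitem__)
        return [guess if tok == MASK_ID else tok for tok in noisy_block]


rng = random.Random(0)
context_tower = ContextTower(vocab_size=8, num_states=4, rng=rng)
denoiser = DenoiserTower(vocab_size=8, num_states=4)

prefix, block = [3, 5, 2], [6, 6, 1, 6]
frozen_before = list(context_tower.weights)

# Clean prefix goes through the frozen tower once; its last state conditions the denoiser.
context_state = context_tower.encode(prefix)[-1]
for _ in range(20):
    noisy = add_noise(block, noise_rate=0.5, rng=rng)
    denoiser.train_step(context_state, noisy, block)

restored = denoiser.denoise(context_state, [MASK_ID, 6, MASK_ID, 6])
print(restored, context_tower.weights == frozen_before)
```

In this sketch the context tower's weights are identical before and after training, which is the property a frozen tower provides; everything learned lives in the denoiser's table.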
