1 post
Nemotron-style diffusion LLMs cut decoding time, but agent latency lives in retrieval and tool calls. Here's what actually changes in production.