Diffusion LMs go mainstream, zero-shot model-size interpolation lands, “hippocampus” memory boosts long-context models, and tiny networks beat big ones at recursive reasoning
Machine Learns #56