M3 Ultra Is Slower Than M2 Ultra for Language Models

rbanffy 9 hours ago

Marginally, in some models only, and only in token generation, if I read the tables correctly.

The models are also not huge, which negates the advantage of the larger memory (192GB on the M2 Ultra, 512GB on the M3 Ultra). Memory bandwidth is more or less the same on both machines, which might leave the compute units starved of data to crunch.
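
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bound, i.e. every generated token has to stream all the model weights from memory once, so the ceiling is bandwidth divided by model size. The bandwidth figures (about 800 GB/s and 819 GB/s) are the nominal specs as I understand them, and the 40 GB model size is a made-up example; none of these come from the benchmark tables, so treat them as assumptions.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough ceiling on generation speed for a memory-bound decoder.

    Assumes each token requires reading all weights from memory once and
    ignores KV cache traffic, compute time, and any overhead.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


# Assumed nominal memory bandwidths in GB/s (not taken from the article).
M2_ULTRA_GB_S = 800.0
M3_ULTRA_GB_S = 819.0

# Hypothetical model footprint in GB, e.g. a large model at 4-bit quantization.
MODEL_GB = 40.0

m2_ceiling = max_tokens_per_second(M2_ULTRA_GB_S, MODEL_GB)
m3_ceiling = max_tokens_per_second(M3_ULTRA_GB_S, MODEL_GB)

print(f"M2 Ultra ceiling: {m2_ceiling:.1f} tok/s")
print(f"M3 Ultra ceiling: {m3_ceiling:.1f} tok/s")
print(f"M3/M2 ratio: {m3_ceiling / m2_ceiling:.3f}")
```

Under these assumptions the two ceilings differ by only a couple of percent, so small swings in either direction in the generation numbers would not be surprising.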

behnamoh 9 hours ago

> Marginally, in some models only, and only in token generation, if I read the tables correctly.

Which is exactly the point. If the benefits are so marginal, why not get a used M2 Ultra instead?