Const-me 9 days ago

If I were an enthusiast, I would rather consider a mini PC with an AMD Strix Halo APU. These things have been "coming soon" for a few months now.

The memory is slower, but not by much: 256 GB/s is much faster than the system memory found in most consumer-targeted PCs. The devices also have way more memory, up to 128 GB. And a system with a Strix Halo APU is a general-purpose computer, whereas these special accelerator cards can only be used for one thing.
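To put rough numbers on the capacity-versus-bandwidth trade-off, here is a minimal sketch. It assumes token generation is limited purely by memory bandwidth, so each generated token costs one pass over whatever weights it has to read. The function names and the 64 GB / 8 GB figures are hypothetical, chosen only for illustration; real throughput will be lower once compute, caches, and software overhead are involved.

```python
def max_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on decode speed when each token streams gb_read_per_token from memory."""
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token


def fits_in_memory(model_gb: float, memory_gb: float) -> bool:
    """True if the whole model fits in the device's memory (ignores KV cache and OS overhead)."""
    return model_gb <= memory_gb


# Figures quoted in the comment above for Strix Halo.
STRIX_HALO_BANDWIDTH_GB_S = 256.0
STRIX_HALO_MEMORY_GB = 128.0

# Hypothetical model: 64 GB of weights in total.
MODEL_GB = 64.0

# Case 1: every token reads all 64 GB of weights.
full_read = max_tokens_per_second(STRIX_HALO_BANDWIDTH_GB_S, MODEL_GB)

# Case 2 (hypothetical): only 8 GB of the weights are touched per token.
partial_read = max_tokens_per_second(STRIX_HALO_BANDWIDTH_GB_S, 8.0)

print(f"fits: {fits_in_memory(MODEL_GB, STRIX_HALO_MEMORY_GB)}")
print(f"full read:    at most {full_read:.1f} tokens/s")
print(f"partial read: at most {partial_read:.1f} tokens/s")
```

Under these assumptions the bound depends on how many bytes each token has to read, not on how large the model is in total, which is why a large but slower memory pool is not automatically a bad deal.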

  • fxtentacle 9 days ago

    256 GB/s is excruciatingly slow for LLM inference. The 5090 has roughly 7x as much (about 1.8 TB/s), and since the task is mostly memory-bandwidth bound, performance scales almost linearly with it.

    • reitzensteinm 9 days ago

      There's a sweet spot for running MoE models, though. If you need the entire model in VRAM but only need to retrieve a part of it per token, trading more memory for less bandwidth can be a win.

      I have a 4090, and given the MoE trend, I'd be more tempted to purchase a Strix Halo next than a 5090.

    • Const-me 9 days ago

      The specialized accelerators discussed in the article have much slower memory than a 5090 GPU. The memory in them delivers 448 or 512 GB/s, only around 2x compared to Strix Halo.

fxtentacle 9 days ago

"Both Blackhole cards offer roughly half the memory bandwidth of a used RTX 3090"

And that means I have no idea what these cards could be useful for. They are more expensive, have roughly the same VRAM, but are much slower.
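As a quick sanity check on the "roughly half" quote, a small sketch comparing the bandwidth figures. The 448 and 512 GB/s values are the ones mentioned elsewhere in this thread for the two Blackhole cards; the 936 GB/s figure for the RTX 3090 is an assumption taken from its published spec, not from the article. The dictionary keys are illustrative labels only.

```python
# Memory bandwidth in GB/s.
BANDWIDTH_GB_S = {
    "Blackhole (lower)": 448,   # figure quoted in this thread
    "Blackhole (higher)": 512,  # figure quoted in this thread
    "RTX 3090": 936,            # assumption: published spec, not from the article
}


def relative_bandwidth(name: str, baseline: str = "RTX 3090") -> float:
    """Bandwidth of `name` as a fraction of the baseline card's bandwidth."""
    return BANDWIDTH_GB_S[name] / BANDWIDTH_GB_S[baseline]


for card in ("Blackhole (lower)", "Blackhole (higher)"):
    print(f"{card}: {relative_bandwidth(card):.0%} of RTX 3090 bandwidth")
```

If decode speed tracks bandwidth, these ratios are also a first guess at relative token throughput on models that fit on either card.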

  • Carstairs 9 days ago

    These have the advantage of not being 5 years old with no warranty.