Thing is, you can trade off speed for quality. For coding support you can settle for Llama 3.2 or a smaller deepseek-r1 and still get most of what you need on a smaller GPU, then scale up to a bigger model that will run slower if you need something cleaner. I’ve had a small laptop with 16 GB of total memory and a 4060 mobile serving as a makeshift home server with an LLM and a few other things and… well, it’s not instant, but I can get the sort of thing you need out of it.
Sure, if I’m digging in and want something faster I can run something else on my bigger PC’s GPU, but a lot of the time I don’t have to.
Like I said below, though, I’m in the process of trying to move that to an Arc A770 with 16 GB of VRAM that I had just lying around because I saw it on sale for a couple hundred bucks and I needed a temporary GPU replacement for a smaller PC. I’ve tried running LLMs on it before and it’s not… super fast, but it’ll do what you want for 14B models just fine. That’s going to be your sweet spot on home GPUs anyway: anything larger than 16 GB and you’re talking 3090, 4090 or 5090, pretty much exclusively.
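For a rough sense of why 14B ends up being the sweet spot on a 16 GB card, here's a back-of-the-envelope sketch. It assumes 4-bit quantized weights and a flat 2 GB allowance for context and runtime buffers; both numbers are my assumptions, not measurements, and the function names are made up for illustration.

```python
def estimate_vram_gb(params_billions, bits_per_weight=4.0, overhead_gb=2.0):
    """Rough VRAM estimate in GB for running a model.

    Weights take params * bits / 8 bytes; overhead_gb is an assumed flat
    allowance for KV cache and runtime buffers, not a measured value.
    """
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


def fits(params_billions, vram_gb, bits_per_weight=4.0, overhead_gb=2.0):
    """True if the estimated footprint fits in the given amount of VRAM."""
    needed = estimate_vram_gb(params_billions, bits_per_weight, overhead_gb)
    return needed <= vram_gb


if __name__ == "__main__":
    # Compare a few model sizes against a 16 GB card under the assumptions above.
    for size in (3, 8, 14, 32):
        need = estimate_vram_gb(size)
        verdict = "fits" if fits(size, 16) else "too big"
        print(f"{size}B @ 4-bit: ~{need:.1f} GB -> {verdict} on 16 GB")
```

Under those assumptions a 14B model needs roughly 9 GB and leaves headroom on 16 GB, while a 32B model needs about 18 GB and pushes you into 24 GB card territory. Real numbers will shift with the quantization format and context length you actually run.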
You didn’t, I did. The starting models cap at 24 GB, but you can spec the biggest one up to 64 GB. I should have clicked through to the customization page before reporting what was available.
That is still cheaper than a 5090, so it’s not that clear cut. I think it depends on what you’re trying to set up and how much money you’re willing to burn. Sometimes literally: the Mac will also be more power efficient than a honker of an Nvidia 90-class card.
Honestly, all I have for recommendations is that I’d rather scale up than down. I mean, unless you also want to play kickass games at insane framerates with path tracing or something. Then go nuts with your big boy GPUs, who cares.
But strictly for LLM stuff I’d start by repurposing what I have around, hitting a speed limit and then scaling up to maybe something with a lot of shared RAM (including a Mac Mini if you’re into those) and keep rinsing and repeating. I don’t know that I personally am in the market for AI-specific multi-thousand-dollar APUs with a hundred plus gigs of RAM yet.