Apple approves driver that lets Nvidia eGPUs work with Arm Macs (twitter.com)

by naves 242 comments 527 points
[−] MrArthegor 41d ago
A good technical project, but honestly useless in like 90% of scenarios.

You want to use an Nvidia GPU for LLMs? Just buy a basic second-hand PC (the GPU is the primary cost anyway). You want a Mac for a good amount of VRAM? Buy a Mac.

With this proposed solution you get a half-baked system: the GPU is limited by the Thunderbolt port and you don't have access to all of Nvidia's tools and libraries, while on the other hand the system lacks the integration of native solutions like MLX and risks breakage in a future macOS update.

[−] afavour 41d ago
Chicken/egg. Nvidia tooling is lacking surely in part because the hardware wasn't usable on macOS until now. Now that it's usable, that might change.
[−] fakebizprez 36d ago
Wrong.

If a model can run on a 512GB M3 Ultra via MLX or CUDA, but simultaneously benefit from the memory bandwidth of something like an RTX 6000 Pro, that would save my company hundreds of thousands of dollars. That's $20,000 for roughly 600GB of VRAM, and enough token generation speed to fulfill the needs of any enterprise that's not a hyperscaler or neocloud.

I'll let someone else do the math for you on what it costs to put together a 10U server to get that kind of performance without the $10K M3 Ultra Studio.

What we're paying for five old 80GB A100s is criminal, but it's nothing compared to what these GB200 Blackwell setups are going to cost in 2030. Market economics aside, the fact that they require sophisticated liquid cooling infrastructure and draw 3x the power of the A100s will make these cards unattainable for small to medium organizations.

So yeah, if there's some outside chance that we can pair Nvidia's speed with an Arm-powered machine that offers 512GB of unified memory while drawing 50W -- you better believe it's a big deal. We'll see. Sounds too good to be true.

[−] dapperdrake 40d ago
Thank you for opening my mind to a viewpoint I didn’t even know existed.

Yes, for many scenarios this is "not even an academic exercise".

For a very select few applications this is Gold. Finally serious linear algebra crunch for the taking. (Without custom GPU tapeout.)

[−] hank808 39d ago
"Nvidia." Not NVidia or nVidia, or the other ways. I feel that I can frequently figure out whether someone is going to express a negative view about this company based only on whether they picked a weird way to write its name.
[−] the_arun 41d ago
I mistook eGPU for virtual GPU, but I was wrong: it means external GPU.
[−] petters 40d ago

> the GPU is limited by the Thunderbolt port

Not everything is limited by the transfer speed to/from the GPU. LLM inference, for example.
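A rough back-of-envelope sketch of why decode-stage LLM inference tolerates a slow link (all numbers are illustrative assumptions, not measurements): the weights sit in VRAM and are re-read every token, while only a small activation vector crosses the Thunderbolt link each way.

```python
# Back-of-envelope: per-token data movement during LLM decode.
# All figures below are assumed round numbers for illustration.
GB = 1e9

model_bytes = 14 * GB      # e.g. a 7B-parameter model at fp16, resident in GPU VRAM
hidden_dim = 4096          # hidden-state width
bytes_per_act = 2          # fp16 activations
link_bw = 3 * GB           # usable PCIe-tunnel bandwidth over Thunderbolt (~3 GB/s)
vram_bw = 1000 * GB        # on-card memory bandwidth (~1 TB/s)

# Per decoded token: every weight is streamed from VRAM once,
# but only one hidden-state vector crosses the link in each direction.
vram_time = model_bytes / vram_bw
link_time = 2 * hidden_dim * bytes_per_act / link_bw

print(f"VRAM reads per token:   {vram_time * 1e3:.2f} ms")
print(f"link transfer per token: {link_time * 1e6:.2f} us")
print(f"VRAM time / link time:   {vram_time / link_time:.0f}x")
```

Under these assumptions the link transfer is thousands of times cheaper than the VRAM traffic, so decode throughput is set by the card's memory bandwidth, not the Thunderbolt port.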

[−] nailer 40d ago

> GPU is limited by the Thunderbolt port

I thought Thunderbolt was like pluggable PCI? The whole point was not to limit peripherals.
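Thunderbolt does tunnel PCIe, but with far less bandwidth than a full-length slot. A quick comparison using nominal figures (real-world throughput is lower, and Thunderbolt shares its budget with DisplayPort/USB traffic):

```python
# Nominal bandwidth comparison: full PCIe slot vs. Thunderbolt tunnel.
# Round figures for illustration; sustained real-world rates are lower.

def gbps_to_gbs(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8

pcie4_x16 = 16 * gbps_to_gbs(16)  # PCIe 4.0: ~16 Gbps/lane * 16 lanes -> ~32 GB/s
tb_total  = gbps_to_gbs(40)       # Thunderbolt 3/4: 40 Gbps total   -> 5 GB/s
tb_pcie   = gbps_to_gbs(32)       # of which ~32 Gbps carries PCIe   -> 4 GB/s

print(f"PCIe 4.0 x16 slot:      {pcie4_x16:.0f} GB/s")
print(f"Thunderbolt total:       {tb_total:.0f} GB/s")
print(f"Thunderbolt PCIe tunnel: {tb_pcie:.0f} GB/s")
print(f"slot / tunnel ratio:     {pcie4_x16 / tb_pcie:.0f}x")
```

So an eGPU sees roughly an eighth of the host-to-card bandwidth of a desktop slot: irrelevant when the working set stays in VRAM, painful when data streams back and forth.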

[−] MIA_Alive 40d ago
Even for running ML experiments, you'd mostly want to run them on rented clusters anyway.
[−] throawayonthe 40d ago
The tooling is just the standard Linux tooling inside the container, no? And Thunderbolt is not a real limitation.
[−] tensor-fusion 41d ago
[flagged]
[−] bangonkeyboard 41d ago
I don't know how Apple has evaded regulatory scrutiny for their refusal to sign Nvidia's eGPU drivers since 2018.
[−] syntaxing 41d ago
From what I understand, this only works with tinygrad [1]. That's better than nothing, but CUDA or Vulkan on PyTorch isn't going to work from this.

[1] https://docs.tinygrad.org/tinygpu/

[−] Keyframe 41d ago
Such a shame both companies are too big on vanity to make great things happen. Imagine if you could run Mac hardware with Nvidia on Linux. It's all there, and closed walls are what's keeping it from happening. That's what we as customers lose when we forgo control of what we purchase to those who sold it to us.
[−] arjie 41d ago
Woah, this is exciting. I'm traveling but I have a 5090 lying around at home. I'm eager to give it a go. Docs are here: https://docs.tinygrad.org/tinygpu/

I hope it'll work on an M4 Mac Mini. Does anyone know what hardware to get? You'll need a full ATX PSU to supply power, right? And then tinygrad can do LLM inference on it?

[−] mlfreeman 41d ago
I followed the instructions link and read the scripts. Although the TinyGPU app is not in source form on GitHub, it looks to me like the GPU is passed into a Linux VM underneath to use the real driver and then somehow passed back out to the Mac (which might be what the tinygrad team actually got approved).

Or I could have totally misunderstood the role of Docker in this.

[−] ajdegol 40d ago
I think Metal doesn't do double precision, which limits some serious physics simming; but if you're doing that I guess you just rent a GPU somewhere.

I would definitely be into this if adding an eGPU were supported first-class.

[−] eoskx 41d ago
Interesting, but it can't run CUDA or, more to the point, nvidia-smi.
[−] wmf 41d ago
Pretty misleading. This driver is only for compute, not graphics.
[−] EagnaIonat 40d ago

> If you have a Thunderbolt or USB4 eGPU and a Mac, today is the day you've been waiting for!

I got an eGPU back in 2018 and could never get it to work. To the point that it soured me from doing it again.

These days for heavy-duty work I just offload to the cloud. This all feels like Nvidia trying to stay relevant versus Arm.

[−] vondur 41d ago
If you could get Nvidia driver support on Macs, I bet Apple would have sold more Mac Pros.
[−] the__alchemist 41d ago
I'm writing scientific software with components (molecular dynamics) that are much faster on a GPU. I'm using CUDA only, as it's the easiest to code for. I'd assumed this meant no-go on ARM Macs. Does this news make that false?
[−] brcmthrowaway 41d ago
What are the limitations of USB4/Thunderbolt compared with a regular PCIe slot?
[−] ece 41d ago
Apple should update this page for ARM Macs, now that eGPUs run tinygrad: https://support.apple.com/en-us/102363
[−] frankc 41d ago
My main thought is: would this allow me to speed up prompt processing for large MoE models? That is the real bottleneck on the M3 Ultra; the tokens per second is pretty good.
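A very rough estimate of why a fast discrete GPU could help here: prefill is compute-bound (roughly 2 * active-params * prompt-tokens FLOPs for a forward pass), and a discrete card has far more matmul throughput than Apple's integrated GPU. All the numbers below are illustrative assumptions, not benchmarks.

```python
# Rough prefill-time estimate for a large MoE prompt.
# Illustrative assumed figures only; real throughput varies widely.

params_active = 37e9        # assumed active parameters per token for a large MoE
prompt_tokens = 32_000      # a long prompt
flops_needed = 2 * params_active * prompt_tokens  # ~2 FLOPs per param per token

m3_ultra_flops = 50e12      # assumed usable fp16 throughput, integrated GPU
dgpu_flops = 400e12         # assumed usable fp16 throughput, discrete Nvidia card

print(f"total prefill FLOPs: {flops_needed:.2e}")
print(f"integrated GPU:  ~{flops_needed / m3_ultra_flops:.0f} s")
print(f"discrete GPU:    ~{flops_needed / dgpu_flops:.0f} s")
```

Under these assumptions, offloading prefill to the discrete card cuts prompt-processing time by the ratio of their compute throughputs; decode can still run where the weights and unified memory live.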
[−] amelius 40d ago
These tinyboxes are so expensive (starting at $12,000); why don't they just put a CPU inside and let users ssh into them?
[−] dd_xplore 41d ago
Why does Apple need to keep drivers inside a walled garden? At least they should support major device categories with official drivers.
[−] lowbloodsugar 41d ago
Can I do prefill on the eGPU and the decode on the Mac?
[−] direwolf20 40d ago
Isn't it sad that we've ended up in a situation where we are talking about "Apple approves" rather than "someone creates"? Fuck Apple.
[−] ErenalpCet 40d ago
yes good report
[−] qoez 41d ago
Idk why this doesn't link to the original source instead of this proxy source: https://x.com/__tinygrad__/status/2039213719155310736
[−] surcap526 39d ago
[dead]
[−] surcap526 39d ago
[dead]
[−] vegabook 41d ago
[flagged]
[−] userbinator 41d ago
[flagged]
[−] tensor-fusion 41d ago
[flagged]
[−] bigyabai 41d ago
The opportunity cost of Apple refusing to sign Nvidia's OEM AArch64 drivers is probably reaching the trillion-dollar mark, now that Nvidia and ARM have their own server hardware.