Nvidia Launches Vera CPU, Purpose-Built for Agentic AI (nvidianews.nvidia.com)

by lewismenelaws 101 comments 179 points


[−] WhitneyLand 61d ago
Agentic AI CPU? No.

It’s a CPU designed for an AI cluster. Their last CPU, Grace, was the same thing, and no one called it agentic.

Vera just has more performance and more bandwidth. It’s cool, I’d like to have one of these clusters, but this is not new.

It’s marketed as agentic AI because that’s fashionable in 2026.

[−] storus 61d ago
They significantly lowered latency compared to EPYC/Xeon, which is critical for streaming agents (e.g. text/audio/video agents).
[−] stingraycharles 61d ago
What latency? How much is it compared to LLM inference speed?
[−] storus 60d ago
See the Redpanda comment/link here.
[−] PeterCorless 61d ago
This is the related benchmark blog from Redpanda [disclosure: I work for Redpanda and I helped write this. Credit to Travis Downs & others at Redpanda for the heavy lifting on the testing and analysis.]

https://www.redpanda.com/blog/nvidia-vera-cpu-performance-be...

[−] jauntywundrkind 61d ago
Given the price of these systems, the ridiculously expensive network cards aren't such a huge deal, but I can't help wondering at the absurdly amazing bandwidth hanging off Vera. The brags about "7x more bandwidth than pcie gen 6" are amazing, but then it has to go over pcie to a network card to chat with anyone else. It might be 800GbE, but that's still so many hops, and pcie is weighty.

I keep expecting to see fabric gains: something where the host chip has a better way to talk to other host chips.

It's hard to deny the advantages of central switching as something easy & effective to build, but reciprocally the high-radix systems Google has been building are amazing. Microsoft's Maia 200 did a gobsmacking amount of Ethernet on chip, 2.8Tbps, but it still feels like so little, like such a bare start. For reference, pcie6 x16 is a bit shy of 1Tbps, so that's vaguely ~45 ish lanes of it.
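The lane arithmetic above checks out as a back-of-envelope calculation; a quick sketch (using the nominal 64 GT/s PCIe 6.0 lane rate as an upper bound, before FLIT/encoding overhead):

```python
# Rough sanity check of the PCIe 6.0 vs Maia 200 bandwidth comparison.
# PCIe 6.0 signals at 64 GT/s per lane (PAM4); usable throughput is a
# bit below that once FLIT encoding overhead is accounted for, hence
# "a bit shy of 1 Tbps" for an x16 slot.
GBPS_PER_LANE = 64            # nominal PCIe 6.0 lane rate, Gb/s
LANES_X16 = 16

x16_gbps = GBPS_PER_LANE * LANES_X16      # 1024 Gb/s raw, ~1 Tb/s
maia_gbps = 2800                          # Maia 200's claimed 2.8 Tb/s on-chip Ethernet
equiv_lanes = maia_gbps / GBPS_PER_LANE   # ~43.75, i.e. "~45 ish lanes"

print(x16_gbps, round(equiv_lanes, 2))   # 1024 43.75
```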

It will be interesting to see what other bandwidth-massive workloads evolve over time, or if this throughput era really ends up serving AI alone. Hoping CXL or someone else slims down the overhead and latency of attachment, soon-ish.

Maia 200: https://www.techpowerup.com/345639/microsoft-introduces-its-...

[−] baal80spam 61d ago
Say what you want about NVIDIA (to me they are just doing what every company would do in their place), but they create engineering marvels.
[−] d_silin 61d ago
It is an 88-core ARM v9 chip, for a somewhat more detailed spec.
[−] gcanyon 61d ago
Anyone know how this compares to Apple’s M5 chips? Or is that comparison apples to oranges?
[−] tencentshill 61d ago
So does this cut out Intel/x86 from all the massive new datacenter buildouts entirely? They've already lost Apple as a customer and are not competitive in the consumer space. I don't see how they can realistically grow at all with x86.
[−] RantyDave 61d ago
Ahhh, so is this a chip "more optimised" for connecting GPUs to reality ... or are they skipping the GPU step entirely? Are GPUs only for training now?
[−] dmitrygr 61d ago

> Purpose-Built for Agentic AI

From the "fridge purpose-built for storing only yellow tomatoes" and "car only built for people whose last name contains the letter W" series.

When can this insanity end? It is a completely normal, garden-variety ARM SoC; it'll run Linux the same as every other ARM SoC does. It is as related to "Agentic $whatever" as your toaster is.

[−] rka128 61d ago
"democratize access to AI and accelerating innovation."

So they make inference cheaper and the models get even worse. Or Jensen Huang has AI psychosis. Or both.

Here is a new business idea for Nvidia: Give me $3000 in a circular deal which I will then spend on a graphics card.

[−] rishabhaiover 61d ago
I'm assuming this is for tool calls and orchestration. I didn't know we needed more exploitable parallelism from the hardware; the bottlenecks were in software (you're not running 10,000 agents concurrently, or that many downstream tool calls).

Can someone explain what is Vera CPU doing that a traditional CPU doesn't?

[−] recvonline 61d ago
Does this mean their gaming GPUs are becoming less in demand, and therefore cheaper/more available again?
[−] yalogin 61d ago
This is not yet the Groq acquisition, so is there another update coming with that, claiming more improvements?
[−] ksec 59d ago
The most interesting part is that Nvidia intends to sell this CPU separately, meaning you don't need to buy an Nvidia GPU to use it.

Other than at the hyperscalers, ARM has yet to enter the server market, and it might well be Nvidia that makes the difference.

[−] kibibu 61d ago
Am I crazy, or is Jensen's statement a copy-paste from ChatGPT?

(Could be both)

[−] akomtu 61d ago
They should've called it Vega: https://doom.fandom.com/wiki/VEGA
[−] FridgeSeal 61d ago
Are we rapidly careening towards a world where _only_ AI “computing” is possible?

Wanted to do general-purpose stuff? Too bad, we jacked the price of everything up, and then started producing only chips designed to run “ai” workloads.

Oh you wanted a local machine? Too bad, we priced you out, but you can rent time with an ai!

Feels like another ratchet on the “war on general purpose computing” but from a rather different direction.

[−] simulator5g 61d ago
The World's First Central Sloppressing Unit
[−] _s_a_m_ 60d ago
what a bizarre title
[−] dude250711 61d ago
A GPU purpose-built for Slop.
[−] jal505 57d ago
I think you're right - the Tauri vs Electron comparison isn't quite the same scale of difference.

Both still run web tech in a wrapper, just with different performance characteristics. The local-first vs cloud distinction is more fundamental, especially for tools that interact with platforms like LinkedIn.

When I built ZenMode, the core insight was that LinkedIn can easily detect automation coming from AWS/datacenter IPs, but when your desktop app uses your actual Chrome browser and home IP, it's indistinguishable from manual usage.

That's why we went with an Electron/Puppeteer architecture running locally rather than yet another cloud service. Check it out at https://zen-mode.io if you're curious about the local execution model.

[−] felixsells 60d ago
[flagged]