Show HN: SPICE simulation → oscilloscope → verification with Claude Code (lucasgerads.com)

by _fizz_buzz_ 35 comments 121 points

[−] iterateoften 28d ago
Beware. I had Claude Code with Opus building boards and using SPICE simulations. It completely hallucinated the capabilities of the board and made some pretty crazy claims, like that I had just stumbled onto the secret billion-dollar hardware project that every home needed.

None of the boards worked and I had to just do the project in codex. Opus seemed too busy congratulating itself to realize it produced gibberish.

[−] _fizz_buzz_ 28d ago
I haven't tried it with Codex yet. But my approach is currently a little bit different. I draw the circuit myself, which I am usually faster at than describing the circuit in plain English. And then I give Claude the SPICE netlist as my prompt. The biggest help for me is that I (and Claude) can very quickly verify that my SPICE model and my hardware are doing the same thing. And for embedded programming, Claude automatically gets feedback from the scope and can correct itself. I do want to try out other models. But it is true, Claude does like to congratulate itself ;)
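For readers unfamiliar with the workflow: a SPICE netlist is compact enough to paste straight into a prompt. An illustrative example (values arbitrary, not from the OP's project): an RC low-pass with a roughly 1 kHz corner, driven by a pulse source.

```spice
* RC low-pass, ~1 kHz corner (illustrative values)
V1 in 0 PULSE(0 3.3 0 1n 1n 0.5m 1m)
R1 in out 1.6k
C1 out 0 100n
.tran 1u 5m
.end
```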
[−] ezst 28d ago
It's because you are holding it wrong!

--courtesy of all the LLM pushers, so they don't have to bother commenting on this one

[−] ZihangZ 28d ago
This matches what I've seen too — the hallucination gets much worse when the loop has no external verifier. "Does this board work?" has no ground truth inside the model, so it defaults to optimistic narration.

What OP is doing here is actually the mitigation: SPICE + scope readout is a verifier the model can't talk its way past. The netlist either simulates or it doesn't, the waveform either matches or it doesn't. That closes the feedback loop the same way tests close it for code.

The failure mode that remains, in my experience, is a layer down: when the verifier itself errors out (SPICE convergence failure, missing model card, wrong .include path), the agent burns turns "reasoning" about environment errors it has seen a hundred times.That's where most of the token budget actually goes, not the design work.
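To make the verifier idea concrete, here is a minimal sketch (function names hypothetical) of the pass/fail check an agent cannot narrate around, assuming both traces have already been resampled onto a common timebase:

```python
import math

def waveform_error(sim, meas):
    """RMS error between simulated and measured samples on a common timebase."""
    if len(sim) != len(meas):
        raise ValueError("resample both traces to a common timebase first")
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(sim, meas)) / len(sim))

def verify(sim, meas, tol=0.05):
    """Return a verdict: either the waveforms match within tolerance or they don't."""
    err = waveform_error(sim, meas)
    return {"rms_error": err, "pass": err <= tol}
```

The point is that the verdict comes from arithmetic on scope data, not from the model's own narration.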

[−] jddj 28d ago
What throws me about this comment is the missing space between the period and the T in the last sentence.

Did the model itself do that? Was it a paste error?

[−] svnt 27d ago
I’ve also noticed Gemini and Claude occasionally mixing up terms recently (e.g. revel vs. reveal) and can’t decide whether it is due to cost-optimization effects or some attempt to seem more human.

I can’t recall either one using a wrong word for quite some time prior to this month.

[−] lambda 27d ago
Or just because mistakes are part of the distribution that it's trained on? Usually the averaging effect of LLMs and top-k selection provides some pressure against this, but occasionally some mistake like this might rise up in probability just enough to make the cutoff and get hit by chance.

I wouldn't really ascribe it to any "attempt to seem more human" when "nondeterministic machine trained on lots of dirty data" is right there.
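A toy illustration of that mechanism (the tokens and logit values here are made up): a rare misspelling only needs to clear the top-k cutoff to become sampleable at all.

```python
import math

def top_k_filter(logits, k):
    """Keep the k highest-logit tokens and renormalize their probabilities."""
    top = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k]
    z = sum(math.exp(v) for _, v in top)
    return {tok: math.exp(v) / z for tok, v in top}

# "revel" is a low-probability neighbor of "reveal"; with k=3 it clears
# the cutoff and can be emitted by chance, while "display" is pruned.
logits = {"reveal": 5.0, "show": 3.0, "revel": 1.5, "display": 1.0}
probs = top_k_filter(logits, k=3)
```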

[−] svnt 27d ago
Sure, but if that were the case why has it gotten worse recently? I would expect it to be as a result of cost optimization or tradeoffs in the model. I suppose it could be an indicator of the exhaustion of high quality training data or model architecture limitation. But this specific example, revel vs reveal, is almost like going back to GPT-2 reddit errors.

I also don’t want to pretend there is no incentive for AI to seem more human by including the occasional easily recognized error.

[−] lambda 27d ago
Or just the models are getting bigger and better at representing the long tail of the distribution. Previously errors like this would get averaged away more often; now they are capable of modelling more variation, and so are picking up on more of these kinds of errors.
[−] svnt 27d ago
That makes sense, but what is the solution?
[−] jddj 27d ago
Looking at the account's other comment there are subtle grammatical errors in that one too.

Would be good to see the prompt out of morbid curiosity

[−] varispeed 28d ago
This week I tried to use Opus to analyse output from an oscilloscope, and it was impossible to complete because the Python scripts (which Opus wrote itself) were flagged as a cybersecurity risk. Baffling.
[−] andrewklofas 28d ago
Hit this exact wall six months back building Claude Code stuff for KiCad review[1]. First pass let Claude read .kicad_sch directly via grep/read. It happily invented pin numbers that didn't exist. Rewrote it with Python analyzers that spit out JSON, now Claude just reads the JSON, problem mostly went away.

Curious how spicelib-mcp handles models that aren't in the bundled library. Do you pass the .lib path as a tool arg, or does the server own a registry?

[1] https://github.com/aklofas/kicad-happy
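A minimal sketch of the analyzer-to-JSON idea (the regexes and the `.kicad_sch` details below are simplified assumptions; a real tool should use a proper S-expression parser rather than pattern matching):

```python
import json
import re

def pin_report(sch_text):
    """Extract reference designators and pin numbers from .kicad_sch text
    and emit JSON for the model to read instead of the raw schematic."""
    refs = re.findall(r'\(property "Reference" "([^"]+)"', sch_text)
    pins = re.findall(r'\(pin "([^"]+)"', sch_text)
    return json.dumps({"references": refs, "pins": sorted(set(pins))})
```

The model then only ever sees pins that the analyzer actually found, which is what makes the invented-pin-number failure mode go away.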

[−] _fizz_buzz_ 28d ago
Spicelib really just makes calls to the selected SPICE engine (in my case ngspice). In this setup, spicelib's main job is to parse the raw SPICE data and provide a unified interface regardless of which SPICE engine is selected. But to answer the question: the path to the SPICE model must currently be set explicitly.
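For illustration, the ASCII variant of ngspice's `.raw` header is simple enough to parse by hand; a stdlib sketch of the kind of work that "unified interface" layer does (spicelib's actual reader also handles binary data and other engines):

```python
def parse_raw_header(text):
    """Parse the header of an ASCII ngspice .raw file into metadata plus
    the variable table. Stops at the Values:/Binary: marker."""
    meta, variables = {}, []
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("Variables:"):
            for _ in range(int(meta["No. Variables"])):
                idx, name, kind = next(lines).split()
                variables.append({"index": int(idx), "name": name, "type": kind})
        elif line.startswith(("Values:", "Binary:")):
            break
        elif ":" in line:
            key, _, val = line.partition(":")
            meta[key] = val.strip()
    return meta, variables
```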
[−] foreman_ 28d ago
[flagged]
[−] jLaForest 28d ago
Very cool. I'm working on a similar KiCad tool for doing the full schematic generation and PCB layout using Python generated by AI. Not quite ready to publish it yet, but I'm glad I'm not the only one who sees the potential of AI-generated code + KiCad.
[−] megaphone 24d ago
Different domain but similar pattern — I hit a related wall building an MCP server for Claude Desktop (Obsidian-based memory system). Not a registry question, but adjacent: how much state the server should own vs. pass through.

The thing that bit me hardest wasn't architectural though, it was a hardcoded 60-second tool call timeout in the MCP SDK used by Claude Desktop. app.asar confirms it — no config knob to raise it. For any long-running tool (mine: extracting and summarizing a 50-page PDF) the only option is detached spawn: Phase 1 kicks off work and returns "queued" within 60s, Phase 2 runs fire-and-forget and writes results to disk for a later kioku_list call to pick up.
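A sketch of the two-phase pattern (the results directory, the worker payload, and the tool name are placeholders, not the commenter's actual code):

```python
import json
import subprocess
import sys
import tempfile
import time
from pathlib import Path

RESULTS = Path(tempfile.gettempdir()) / "mcp-results"  # placeholder location

def long_tool(job_id):
    """Phase 1: spawn the real work detached and return "queued" well inside
    the ~60 s Desktop timeout. Phase 2 (the worker process) writes its
    result to disk for a later list/fetch tool call to pick up."""
    RESULTS.mkdir(exist_ok=True)
    worker = (
        "import json, sys, pathlib; "
        "pathlib.Path(sys.argv[1]).write_text("
        "json.dumps({'job': sys.argv[2], 'status': 'done'}))"
    )
    subprocess.Popen(
        [sys.executable, "-c", worker, str(RESULTS / f"{job_id}.json"), job_id],
        start_new_session=True,  # detach so it survives the tool call returning
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return {"job": job_id, "status": "queued"}
```

`start_new_session=True` is POSIX-only; the key design point is that phase 1 never blocks on the work itself.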

If your server ever does work that might exceed ~45 seconds on Desktop, worth designing that in early. Claude Code's CLI doesn't have this limit, but Desktop users will hit it.

[−] Eextra953 28d ago
Nice scope! I had a similar experience with using Claude to automate circuit design/simulation/optimization and found that they are not good at it. They are surprisingly good at taking raw files and describing what is in them, but they fall apart when trying to do anything other than design the simplest circuit. I think it is because they have no concept of the physics behind a circuit, so they cannot make changes that a designer would make. For optimizing a circuit using, say, an EM simulator, they don't know what to tweak and how to tweak it. In the end, I had to write a script to talk to the simulator and create a config file that specified the bounds of the simulation: step size, optimization algorithm, min, max, etc. Only then could I use an agent to call the script to optimize the circuit.
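A sketch of that config-driven approach, with the simulator call stubbed out as a plain cost callable and the bounds file reduced to a dict (the schema here is hypothetical):

```python
import itertools

def sweep(config, evaluate):
    """Grid-sweep every parameter within its configured bounds and return
    (best_params, best_cost). `evaluate` stands in for the script that
    drives the external simulator; lower cost is better."""
    axes = {
        name: [spec["min"] + i * spec["step"]
               for i in range(int((spec["max"] - spec["min"]) / spec["step"]) + 1)]
        for name, spec in config["params"].items()
    }
    best = None
    for values in itertools.product(*axes.values()):
        params = dict(zip(axes.keys(), values))
        cost = evaluate(params)
        if best is None or cost < best[1]:
            best = (params, cost)
    return best

# The bounds/step config the agent is allowed to edit (hypothetical schema).
config = {"params": {"w": {"min": 1.0, "max": 3.0, "step": 1.0},
                     "l": {"min": 0.5, "max": 1.5, "step": 0.5}}}
```

The agent only chooses bounds and step sizes; the sweep itself stays deterministic, which is exactly the division of labor the comment describes.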
[−] _fizz_buzz_ 27d ago
Yeah, taking the SPICE netlist as the starting point works much better, imo. I also prepopulate the CLAUDE.md file with some information like the pinout/pinmux of the MCU, otherwise Claude might run in circles targeting the wrong pin (to be fair, that also happens to me, lol).
[−] Scene_Cast2 28d ago
I've found that having LLMs work with mermaid diagrams makes describing and modifying circuits less annoying.
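For example, a simple RC divider is easy to express as a mermaid graph (node names illustrative):

```mermaid
graph LR
    VIN((Vin)) --- R1["R1 1k"] --- VOUT((Vout))
    VOUT --- C1["C1 100n"] --- GND((GND))
```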
[−] hulitu 28d ago
Measure with a micrometer, mark with a pencil, cut with an axe.
[−] Archit3ch 28d ago
Nice! Doing something similar with a Jumperless so that the model can reconfigure the circuit on the fly.
[−] _fizz_buzz_ 28d ago
Oh, I remember seeing Jumperless a while ago, but completely forgot about it. Combining this with something like Jumperless does sound interesting. What does your setup look like? Does Claude tell you: "try a 1k resistor in parallel here"?
[−] Archit3ch 28d ago
It's just measurements for now. But sourcing ideas from the model could be interesting!
[−] Schlagbohrer 28d ago
Great use case!
[−] skyberrys 28d ago
This is an interesting use case with Claude. It sounds like you took away some tedious work with the checking of waveforms, and you are able to speed up your design loop because of it.
[−] leninkatta 24d ago
What I like here is the shift from better prompts to better feedback. With SPICE and scope data in the loop, it can actually iterate instead of guessing. The file-based approach is also a small detail that makes a big difference in practice.
[−] dharma1 28d ago
this kind of thing is super cool to close the loop.

waiting for FPAA to get better so we can vibecode analog circuits

https://www.eetimes.com/podcasts/making-analog-chip-designs-...

[−] analog_daddy 26d ago
I mean, yeah, FPAAs would be awesome, and coming from discrete analog hobby electronics, I used to wish for something like them.

But in my short two years in the analog IC design industry, I have been so divorced from the actual silicon that I have rarely had a chance to go into the lab and probe around the teeny tiny block I worked on in the complex labyrinth of the SoC. I don’t wish for it (I learned the hard way to be careful what you wish for; in this case, if I am in the lab debugging something in silicon, it means something terrible has happened to what I worked on, and it might have cost the company about $200k or more), but someday soon I will get into the lab just to play around with the fancy oscilloscope.

In the meantime, I did realize the invaluable power of having a Python frontend API for querying basic details of your devices (Python and not SKILL/Lisp, since it pretty much works with any AI and is very well supported), and AI has been okayish with it. I feel AI would be a good aid in actual circuit design if it understood the topology of the circuit, which at this point I am tempted to say might require something akin to an AST, but for SPICE. However, AI has been awesome at regexes and scripting, which is also the meh and boring part of the circuit design process.

[−] talsania 25d ago
The AST idea for SPICE is something I've thought about too. A netlist is already a graph; the LLM just can't see it that way when it's flat text. Serializing it with topology intact (adjacency, port polarities, device semantics) is basically what your Python frontend is doing implicitly, which explains why it behaves so much better than dumping a raw netlist into the prompt.
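A sketch of that serialization, assuming only two-terminal devices for brevity:

```python
from collections import defaultdict

def netlist_graph(netlist):
    """Map node -> [(device, terminal_index)] so topology is explicit.
    Only handles two-terminal lines (R, C, L, V, I); real devices need
    per-type pin counts and model lookups."""
    nodes = defaultdict(list)
    for line in netlist.splitlines():
        line = line.strip()
        if not line or line.startswith(("*", ".")):
            continue
        parts = line.split()
        device, terminals = parts[0], parts[1:3]
        for i, node in enumerate(terminals):
            nodes[node].append((device, i))
    return dict(nodes)
```

Handing the model this adjacency view instead of flat text makes questions like "what shares a node with R1?" a lookup rather than an inference.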
[−] walski 28d ago

> SPICE (Simulation Program with Integrated Circuit Emphasis) is a general-purpose, open-source analog electronic circuit simulator. [1]

1: https://en.wikipedia.org/wiki/SPICE

[−] kleene_op 28d ago
Really nice. My mother is an applied physics teacher, and she told me they had a hard time at work figuring out how they could connect their teaching material to LLMs in a relevant way. This should be useful to her.
[−] hexo 28d ago
Heh, this is like the last thing you need claude for. I mean, you have eyes and brain.
[−] folays 26d ago
CLAUDE.md: Make no mistakes. Write good code, don't write bad code. Also, never change the input impedance to 50 Ω.
[−] mystraline 27d ago
Can we start pivoting to local LLM integration rather than choosing a service that has had something like 5 rug-pulls?

Ye ol' poop splatter (Claude) is getting worse, more expensive, and anti-user. Local may be slower, but it is where the future of LLMs is heading.

[−] fredcallagan 23d ago
[dead]
[−] pixelsort 27d ago
[dead]
[−] alex1sa 28d ago
[flagged]
[−] vomayank 28d ago
[flagged]
[−] _fizz_buzz_ 28d ago
Claude can absolutely correct itself and change the source code on the MCU and adapt. However, it also makes mistakes, such as claiming it matched the simulation when it obviously didn't. Or it might make dubious decisions, e.g. bit-banging a pin instead of using the dedicated UART subsystem. So, I don't let it build completely by itself.
[−] vomayank 27d ago
[flagged]
[−] redoh 28d ago
[flagged]
[−] yashjadhav2102 28d ago
[dead]
[−] strimoza 28d ago
[dead]
[−] pukaworks 27d ago
[dead]