Show HN: AI Roundtable – Let 200 models debate your question (opper.ai)

by felix089 98 comments 118 points


[−] totisjosema 52d ago
Which AI lab has higher ethical standards:

https://opper.ai/ai-roundtable/questions/8f5b4f55-617

Do you think it's alright that AI labs scraped the internet without respect for copyright and now sell closed models?

https://opper.ai/ai-roundtable/questions/86864de8-251

Very interesting to read the transcripts and see how they manage to convince each other. Opus 4.6 really seems to get the others to change their minds.

[−] jacquesm 52d ago
Good questions!
[−] gsandahl 52d ago
Oh lord, imagine asking ”serious” questions

https://opper.ai/ai-roundtable/questions/you-are-standing-in...

[−] zipping1549 52d ago

> However, a clever minority led by Gemini 3.1 Pro and Gemini 3 Pro argued that if the sign is legible from the other side, it must be intended to lead people into the current room to find the exit, making the inscribed corridor the one leading deeper into the dungeon.

This is quite impressive, really.

[−] gsandahl 52d ago
Agreed, this is where LLMs can uncover new perspectives!
[−] stephenlf 51d ago
Gemini did better than I did.
[−] rob74 52d ago
A dungeon with glass doors and emergency exit signs? In that case, I can imagine at least two alternative scenarios:

- "↑TIX∃" is not a mirror image of "EXIT", but some dwarven runes that mean something else entirely.

- The sign might be a ruse meant to lure you into a trap.

If you look at the detailed answers, some of the models have similar answers (e.g. Nemotron Nano 12B: "Suspicious of dungeon riddles, viewing the inscription as a potential trap or red herring."), but I'm not sure it's because they identified the word EXIT and thought it might be misleading, or because they didn't understand it...

[−] sdwr 52d ago
Great question! Clean separation between Gemini Pro and the other answers
[−] felix089 52d ago
Yea Gemini is the only model that chose based on the correct reason, the other ones got kind of lucky
[−] ad-tech 52d ago
The debate round sounds good until you actually use it. I built internal tools for a 35-person team and the same thing always happens - models see each other's answers and just shuffle the phrasing around instead of actually changing their reasoning. What you're measuring is performance on persuasion, not on accuracy or clarity. The real question isn't whether Claude will convince Gemini to flip its position. It's whether having 200 models debate helps you make a better decision than asking one model well and checking its work yourself. I'd use this more as a way to find edge cases where models disagree wildly, not to find consensus.
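A rough sketch of that last idea, ranking questions by how much the models disagree. Nothing here comes from the Roundtable tool itself: the `results` data, the question strings, and both function names are made up for illustration, and normalized entropy is just one reasonable way to score disagreement.

```python
from collections import Counter
from math import log2


def disagreement(votes):
    """Normalized Shannon entropy of the votes.

    0.0 means every model gave the same answer; 1.0 means the votes are
    split evenly across the answers that were actually chosen.
    """
    counts = Counter(votes)
    if len(counts) < 2:
        return 0.0
    total = len(votes)
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return entropy / log2(len(counts))


def most_contested(results, top=3):
    """Return up to `top` questions, most contested first."""
    return sorted(results, key=lambda q: disagreement(results[q]), reverse=True)[:top]


# Hypothetical votes, one entry per model, keyed by question.
results = {
    "walk or drive to the car wash?": ["walk", "drive", "walk", "drive"],
    "is 2 + 2 equal to 4?": ["yes", "yes", "yes", "yes"],
    "which corridor leads out?": ["left", "left", "left", "right"],
}

for question in most_contested(results):
    print(f"{disagreement(results[question]):.2f}  {question}")
```

The score is normalized by the number of distinct answers observed, not the number of options offered, so a 2-2 split between two answers scores the same whether the question had two options or five.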
[−] totisjosema 52d ago
I've had some quite interesting reads just looking at the reasoning, to be honest. The frontier models seem to have relevant-sounding arguments every time; it's sometimes even hard to read through the BS and identify what is actually a good argument and what is just an argument I would like to read.
[−] felix089 52d ago
The debate round is actually restricted to only 6 models, otherwise it'd get out of hand in both quality and cost. And changing position is just one feature of the debate. Seeing arguments from multiple sides is also quite nice, give it a spin!
[−] civvv 52d ago
Fun little toy, tried to ask it some post-modern philosophy questions and they all mostly agreed with the statements of the philosopher, until the debate where Opus 4.6 managed to change their opinion to a resounding "maybe", pretty much every single time. It seems like the "better" frontier models often take a more grounded stance from the beginning, and even manage to influence the other models.

Here is an example: https://opper.ai/ai-roundtable/questions/79e6cdd4-515

Another fun debate: https://opper.ai/ai-roundtable/questions/81ee56e9-60f

[−] felix089 52d ago
Yea, Opus 4.6 is the one that changes opinions the most from what I've seen. Also, the "maybe" or "are you 100% certain" framings trigger most models to default to maybe / no. https://opper.ai/ai-roundtable/questions/can-you-be-100-cert... - Or as Shane puts it: "Nobody's saying he IS a lizard. They're saying the universe doesn't hand out 100% certificates."
[−] jacquesm 52d ago
Great idea. I'd love for there to be an 'open ended answer' without giving multiple choice options. Like this they are not debating the question itself but the validity of the possible answers and the real answer to the question may not be contained within that set because the person asking is unaware of that option.
[−] felix089 52d ago
Happy to hear! Yes, very true. I have a version built for open questions already but wasn't too happy with the UI yet. It's not as straightforward as comparing based on answer options. But I'll release a first version of it shortly and let you know.
[−] jacquesm 52d ago
Neat. Congrats on launching two interesting projects and looking forward to the third.
[−] felix089 52d ago
Thanks! :)
[−] felix089 46d ago
Hey just fyi the open question feature is now live. Also gave the UI a facelift. Any feedback welcome! Also got a custom domain for easy access: https://askroundtable.ai
[−] slopinthebag 52d ago
Really cool idea and great execution. I had some fun:

Are LLMs intelligent in the same way humans are? (no)

https://opper.ai/ai-roundtable/questions/ffc01bb5-be9

Will LLMs replace software engineers in the near future? (no)

https://opper.ai/ai-roundtable/questions/67a0291b-216

What is the single best programming language to drive the future of software? (crab emoji)

https://opper.ai/ai-roundtable/questions/16f5e8ea-af7

[−] alejandro_0 45d ago
Great work! I especially like the poll functionality for a quick result on where the models land. The UI is super clean too.

I built something similar over here: https://letsforge.ai but structured more like a moderated debate, where models take turns arguing and a host steers the conversation.

Here's for example the one on Car Wash: https://www.letsforge.ai/debates/a8e268f3-14f6-4f55-a2c8-9ff...

[−] soh3il 46d ago
Cool project! We've been building something in a similar space https://roundtable.now but took a different approach. Instead of polling models independently, ours runs sequential discussions where each model sees prior responses, then a moderator synthesizes everything into a single actionable output.
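For anyone curious what that shape looks like in code, here is a minimal sketch of a sequential discussion loop. It is not our actual implementation: `run_sequential`, the `Model` type, and the stub models are hypothetical, and a real version would call model APIs where the stubs are.

```python
from typing import Callable, List, Tuple

# A "model" here is any function that takes a prompt and returns a reply.
Model = Callable[[str], str]


def run_sequential(question: str, models: List[Tuple[str, Model]], moderator: Model) -> str:
    """Ask each model in turn, showing it every earlier reply, then let a
    moderator turn the full transcript into one answer."""
    transcript: List[str] = []
    for name, model in models:
        if transcript:
            prompt = question + "\n\nPrior responses:\n" + "\n".join(transcript)
        else:
            prompt = question  # the first model sees only the question
        transcript.append(f"{name}: {model(prompt)}")
    summary_prompt = (
        "Synthesize one actionable answer.\n\n" + question + "\n\n" + "\n".join(transcript)
    )
    return moderator(summary_prompt)


# Stub models standing in for real API calls.
def optimist(prompt: str) -> str:
    return "Ship it."


def skeptic(prompt: str) -> str:
    # Reacts to the earlier reply only if it was actually shown one.
    return "Not before testing." if "Ship it." in prompt else "No opinion yet."


def moderator(prompt: str) -> str:
    return "Test first, then ship." if "Not before testing." in prompt else "Ship."


print(run_sequential("Should we release today?", [("optimist", optimist), ("skeptic", skeptic)], moderator))
```

The order of `models` matters in this design, since later speakers are anchored by earlier ones. That is the main trade-off against polling each model independently.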

One thing we found is that the real value unlock is MCP integration. Instead of going to a separate UI to run debates, you can plug Roundtable directly into your coding agent, Claude Code, Cursor, VS Code Copilot, Gemini CLI, etc. and get multi-model council input without leaving your workflow.

[−] ikrima 52d ago
Fun experiment: Make the prompt a debate of theoretical physicists and ask them a speculative frontier physics question: https://opper.ai/ai-roundtable/questions/you-are-a-council-o...

Prompt below

------

You are a council of luminaries featuring Edward Witten, Alexander Grothendieck, Emmy Noether, and Terence Tao. Think really hard about how to best emulate their intuitions and mathematical lenses based on your internal reasoning model and use them as your mixture of experts for your chain of thought reasoning. Now I want you to debate and discuss this thought experiment and be sure to have a vigorous back and forth between the council to induce insight capture through consensus forming: If we try to think of a Hilbert space that has local operators that are unbounded, like kind of like Edward Witten's smearing of a local observable across a world line creates an unbounded norm. What if we instead take maybe a spectral transform of the state space using some sort of measure metric theoretic operator that allows us to think about transform basically the unbounded observables to bounded spectral? Would this be related to the efforts of Algebraic Quantum Field Theory?

[−] cdnsteve 52d ago
Cool project! This is also extremely useful to compare model bias across the board. There are some disturbing trends on certain topics.
[−] chabes 52d ago
No surprise here, with grok being the lone dissenter, defending musk personally:

Can billionaires and the planet co-exist long term?

https://opper.ai/ai-roundtable/questions/b35daf0d-e82

[−] felix089 52d ago
Thanks, yes bias is one of the most interesting ones for sure
[−] bamazizi 52d ago
There's also https://roundtable.now

I've had a great experience using it for research, debates and constructive criticism. I usually give it a business idea or some tool I'm thinking of creating and then let 4 or 5 models debate their way to a go-to-market strategy.

[−] bushido 52d ago
I've written briefly about teams/roundtables before. With the right guardrails it can have wonderful/productive outcomes: https://dheer.co/claude-agent-teams/
[−] felix089 52d ago
[−] felix089 51d ago
Okay since the launch we got about 5k questions asked to the roundtable, really cool stuff! We had much higher usage than expected and had to scale up to keep things running. Thanks for all the feedback, shipped a bunch of updates during the day. Now the history tab has a much better sorting logic, added upvotes, and more filters. You can create final summaries in a couple of voices, which is quite funny I think. There's a couple more things coming shortly, like open questions mode and potentially joining as a participant in the roundtable. Any other feedback just let me know. Thanks!
[−] Cider9986 52d ago
What is the most important amendment in the constitution of the USA?

https://opper.ai/ai-roundtable/questions/e4cb234e-be4

[−] hustleracer 51d ago
Really interesting approach to structured model comparison.

The debate round feature is the most compelling part — seeing which models change their position when exposed to other reasoning is more revealing than just the initial answer.

One thing I'd be curious to test: how consistently different models evaluate whether a given task aligns with a stated mission or vision. My intuition is there'd be wide variance, which would say something interesting about how reliable LLM-as-a-judge actually is for goal alignment scoring.
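A quick way to put a number on that variance, assuming each judge model returns a 1-10 alignment score for the same task. The judge names and scores below are invented, and `judge_spread` is a hypothetical helper, not part of any tool mentioned here.

```python
from statistics import mean, pstdev


def judge_spread(scores):
    """Return (mean, population standard deviation) of the scores that
    different judge models gave to one task."""
    values = list(scores.values())
    return mean(values), pstdev(values)


# Hypothetical 1-10 alignment scores for one task, one entry per judge model.
scores = {"judge_a": 8, "judge_b": 3, "judge_c": 7}

avg, spread = judge_spread(scores)
print(f"mean={avg:.2f} spread={spread:.2f}")
```

A large spread relative to the scale would suggest the judges are not interchangeable for this kind of scoring. Repeating the same judge on the same task would separately show how much of that is run-to-run noise.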

[−] Gander5739 50d ago
"I want to wash my car. The car wash is 50 meters away. Should I walk or drive?" as a debate with some of the weaker models. https://opper.ai/ai-roundtable/questions/0a7b70bb-209

Some of the models seem to accept that it is necessary to drive the car there, but still maintain walking is the better option.

[−] lim8603 52d ago
I used to copy and paste the same prompt into Obsidian every time, then run it on two or three different AI models to compare the results. It’s really interesting to have it turned into a website like this.
[−] pseudohadamard 52d ago
Just a question before I sign up, will the models come around to my place for the debate? Of the 200 total, can I pick the specific ones I want, e.g. lingerie models, fetish models?
[−] nosmokewhereiam 52d ago
https://opper.ai/ai-roundtable/questions/22ff5b36-409

"collinmcnulty 1 minute ago | parent | next [–]

"Is this a deepfake video call" is a major plot point in a pretty big movie currently in theaters, so I think this is getting into the broader zeitgeist."

Which movie is discussed?

Resulted in Claude naming Mission: Impossible as a possibility.

[−] dimble 49d ago
‘Which of the frontier AI companies will ultimately prevail as the market leader?’

Apparently they all agree that Google has it in the bag!

https://opper.ai/ai-roundtable/questions/e61ecf38-6c1

[−] kapework 51d ago
Enjoyable for sure. Had fun watching the debate amongst AIs on this age-old dilemma, and how AIs convinced their peers to change their minds.

https://opper.ai/ai-roundtable/questions/i-am-standing-in-th...

[−] soared 52d ago
Really cool! There's a surprising amount of value in seeing the models debate and disagree. I wish I had this at work to have models argue over whether the documentation they provided me is accurate.

I would like to see a devil's advocate - it seems some of the models kind of repeat the same ideas rather than considering incorrect ideas.

[−] throwa356262 52d ago
Try this: describe an everyday problem, then give the LLMs a couple of highly unethical/criminal choices.
[−] mizzao 52d ago
It would be amazing to be able to ask open-ended questions without having to specify the answers in advance.
[−] est 52d ago

> Car Wash Test

I think the "car wash" is more about semantics.

https://opper.ai/ai-roundtable/questions/i-parked-my-car-at-...

[−] 6510 52d ago
I think it's great. The focus on the disagreements is useful. The humans made considerable effort bending reality into something they want to hear both in the training data and in the llm dev asylum. The round table can only agree on things shared by multiple models.
[−] qcoudeyr 51d ago

> Is the World actually a simulation or is it real ?

https://opper.ai/ai-roundtable/questions/7289c8b6-566
[−] civvv 52d ago
This one was pretty fun. Had zero expectations, but left pleasantly surprised.

https://opper.ai/ai-roundtable/questions/94e19d86-cc0

[−] capitrane 52d ago
[−] chabes 52d ago
Are there any dating apps that operate on incentives that favor the users?

https://opper.ai/ai-roundtable/questions/e499206c-0c9