You type a question, define answer options, pick up to 50 models at a time from a pool of 200+, and they all answer independently under identical conditions. No system prompt, structured output, same setup for every model.
You can also run a debate round where models see each other's reasoning and get a chance to change their minds. A reviewer model then summarizes the full transcript. All models are routed via my startup Opper. Any feedback is welcome!
Hope you enjoy it, and would love to hear what you think!
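For anyone curious how the flow described above (independent answers, then a debate round where models see each other's reasoning) could be wired up, here is a minimal sketch. It is not how the roundtable is actually built and it does not use Opper's API. The `ModelFn` interface, the `Answer` type, the stub models, and every function name are hypothetical, and real model calls are replaced by plain Python functions so the example runs offline.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Answer:
    choice: str
    reasoning: str


# Hypothetical model interface: (question, options, transcript) -> Answer.
# transcript is None in the independent round, a string in the debate round.
ModelFn = Callable[[str, List[str], Optional[str]], Answer]


def independent_round(question: str, options: List[str],
                      models: Dict[str, ModelFn]) -> Dict[str, Answer]:
    # Every model gets the same question and options, and sees nothing else.
    return {name: fn(question, options, None) for name, fn in models.items()}


def debate_round(question: str, options: List[str], models: Dict[str, ModelFn],
                 first: Dict[str, Answer]) -> Dict[str, Answer]:
    # Every model sees the full first-round transcript and may change its answer.
    transcript = "\n".join(
        f"{name}: {a.choice} ({a.reasoning})" for name, a in first.items()
    )
    return {name: fn(question, options, transcript) for name, fn in models.items()}


def tally(answers: Dict[str, Answer]) -> Counter:
    return Counter(a.choice for a in answers.values())


def changed_minds(first: Dict[str, Answer], second: Dict[str, Answer]) -> List[str]:
    # Crude stand-in for the reviewer step: list who switched between rounds.
    return [name for name in first if first[name].choice != second[name].choice]


# Stub "models" (assumptions for the demo, not real LLM calls).
def stubborn(question: str, options: List[str], transcript: Optional[str]) -> Answer:
    return Answer(options[0], "I stay with the first option")


def follower(question: str, options: List[str], transcript: Optional[str]) -> Answer:
    if transcript is None:
        return Answer(options[1], "initial guess")
    # Switch to whatever the majority picked in the first round.
    picks = [line.split(": ", 1)[1].split(" ", 1)[0] for line in transcript.splitlines()]
    majority = Counter(picks).most_common(1)[0][0]
    return Answer(majority, "persuaded by the majority")


models = {"stubborn1": stubborn, "stubborn2": stubborn, "follower": follower}
question, options = "Is a hot dog a sandwich?", ["A", "B"]
first = independent_round(question, options, models)
second = debate_round(question, options, models, first)
print(tally(first), tally(second), changed_minds(first, second))
```

In a real setup each `ModelFn` would wrap a network call to a model, and the reviewer would be another model reading the transcript rather than a simple diff of votes.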
https://opper.ai/ai-roundtable/questions/8f5b4f55-617
Do you think it's alright that AI labs scraped the internet without respect for copyright and now sell closed models?
https://opper.ai/ai-roundtable/questions/86864de8-251
Very interesting to read the transcripts and see how they manage to convince each other. Opus 4.6 seems to really get the others to change their minds.
https://opper.ai/ai-roundtable/questions/you-are-standing-in...
This is quite impressive, really.
- "↑TIX∃" is not a mirror image of "EXIT", but some dwarven runes that mean something else entirely.
- The sign might be a ruse meant to lure you into a trap.
If you look at the detailed answers, some of the models have similar answers (e.g. Nemotron Nano 12B: "Suspicious of dungeon riddles, viewing the inscription as a potential trap or red herring."), but I'm not sure it's because they identified the word EXIT and thought it might be misleading, or because they didn't understand it...
https://opper.ai/ai-roundtable/questions/i-am-standing-in-th...
Here is an example: https://opper.ai/ai-roundtable/questions/79e6cdd4-515
Another fun debate: https://opper.ai/ai-roundtable/questions/81ee56e9-60f
Prompt below
------
You are a council of luminaries featuring Edward Witten, Alexander Grothendieck, Emmy Noether, and Terence Tao. Think really hard about how to best emulate their intuitions and mathematical lenses based on your internal reasoning model and use them as your mixture of experts for your chain of thought reasoning. Now I want you to debate and discuss this thought experiment and be sure to have a vigorous back and forth between the council to induce insight capture through consensus forming: If we try to think of a Hilbert space that has local operators that are unbounded, like kind of like Edward Witten's smearing of a local observable across a world line creates an unbounded norm. What if we instead take maybe a spectral transform of the state space using some sort of measure metric theoretic operator that allows us to think about transform basically the unbounded observables to bounded spectral? Would this be related to the efforts of Algebraic Quantum Field Theory?
I've had a great experience using it for research, debates and constructive criticism. I usually give it a business idea or some tool I'm thinking of creating and then let 4 or 5 models debate it into a go-to-market strategy.
Why are you recommending something so sketchy?
"collinmcnulty 1 minute ago | parent | next [–]
"Is this a deepfake video call" is a major plot point in a pretty big movie currently in theaters, so I think this is getting into the broader zeitgeist."
Which movie is discussed?
That resulted in Claude naming Mission: Impossible as a possibility.
Can billionaires and the planet co-exist long term?
I think the "car wash" is more about semantics.
https://opper.ai/ai-roundtable/questions/i-parked-my-car-at-...
I would like to see a devil's advocate. It seems some of the models kind of repeat the same ideas rather than considering incorrect ideas.
You can self-host as well, but not via the desktop app. Server setup required.
Be careful of your token context, you can easily rack up costs if you leave Opus selected as the model and get lost in some rabbit hole of results.
Enjoy enjoy!
btw, what does this mean:
> 'any' in the prompt was satisfied by both casual-alignment and niche boutique models.
It would be cool if the human user could be a participant in the debate, getting a vote and the chance to state their reasoning.
Are LLMs intelligent in the same way humans are? (no)
https://opper.ai/ai-roundtable/questions/ffc01bb5-be9
Will LLMs replace software engineers in the near future? (no)
https://opper.ai/ai-roundtable/questions/67a0291b-216
What is the single best programming language to drive the future of software? (crab emoji)
I'll give Sonnet another go.
It would be nice to support collections of claims, with a table of summaries. I would love to list out a few dozen phony concepts from school and have a shareable chart of the rejections that expand.
I really like the UI. It's nice to read the expanded results.
But how do you afford the tokens?
What year is it?