Agentuity Documentation

Model Arena

Compare outputs from different AI models by using another AI model as the judge. Generate content from multiple providers in parallel via the AI Gateway, then have a judge model score each output on criteria you define: creativity, accuracy, tone, or whatever matters for your use case. This is useful for comparing models or for testing different prompts.
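
The generate-then-judge flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the demo's actual implementation: `run_arena`, `Result`, the two stub models, `length_judge`, and the 0-10 scoring scale are all hypothetical names and choices introduced here. No real AI Gateway or provider call is made; in practice each stub would be replaced by a call to a hosted model.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List

# A "model" is any async callable from prompt to text. In a real setup this
# would wrap a provider call; the stand-ins below are hypothetical.
Model = Callable[[str], Awaitable[str]]
# A judge receives the prompt, one candidate output, and the criteria, and
# returns one score per criterion (a 0-10 scale is assumed here).
Judge = Callable[[str, str, List[str]], Awaitable[Dict[str, float]]]


@dataclass
class Result:
    model: str
    output: str
    scores: Dict[str, float]

    @property
    def total(self) -> float:
        return sum(self.scores.values())


async def run_arena(
    prompt: str,
    competitors: Dict[str, Model],
    judge: Judge,
    criteria: List[str],
) -> List[Result]:
    names = list(competitors)
    # Generate from every competitor concurrently.
    outputs = await asyncio.gather(*(competitors[n](prompt) for n in names))
    # Score every output concurrently with the same judge and criteria.
    score_sets = await asyncio.gather(
        *(judge(prompt, out, criteria) for out in outputs)
    )
    results = [Result(n, o, s) for n, o, s in zip(names, outputs, score_sets)]
    # Highest total score first.
    return sorted(results, key=lambda r: r.total, reverse=True)


# Stub competitors standing in for real models.
async def terse_model(prompt: str) -> str:
    return f"{prompt}."


async def verbose_model(prompt: str) -> str:
    return f"{prompt}, and in the dream it sees stars it has never catalogued."


async def length_judge(
    prompt: str, output: str, criteria: List[str]
) -> Dict[str, float]:
    # Toy heuristic standing in for a judge model: longer outputs score
    # higher on every criterion, capped at 10.
    score = min(10.0, len(output.split()) / 2)
    return {c: score for c in criteria}


def demo() -> List[Result]:
    return asyncio.run(
        run_arena(
            "A robot discovers it can dream",
            {"terse": terse_model, "verbose": verbose_model},
            length_judge,
            ["creativity", "tone"],
        )
    )


if __name__ == "__main__":
    for r in demo():
        print(r.model, r.total, r.scores)
```

Keeping the judge as a separate callable means the criteria and the judging model can change without touching the competitors, which is the property that makes this pattern handy for comparing prompts as well as models.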

Competitors: OpenAI/gpt-5-nano, Anthropic/claude-haiku-4-5
Judge: Groq/gpt-oss-120b
Prompt: A robot discovers it can dream
Tone: sci-fi