Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifd2nihdtxztok4qbsp6hza37yxxaqqrdyqvraiqrbsvdvam3q24i",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mmur7qegolq2"
  },
  "path": "/t/built-a-multi-model-ai-playground-at-16-openai-models-are-the-backbone/1381908#post_1",
  "publishedAt": "2026-05-28T00:07:57.000Z",
  "site": "https://community.openai.com",
  "textContent": "Hey everyone,\n\nI’ve been learning the OpenAI API for the past few months and ended up building something bigger than I expected — a web app called Nova that lets you test and compare different AI models side by side, including GPT-5.5 and other models.\n\nThe idea started because I wanted to understand how different models handle the same prompts. Like, how does GPT-5.5 compare to Gemini on coding tasks? How does Claude handle creative writing vs GPT? Instead of switching between 5 different websites, I built one interface to test them all.\n\nWhat it does:\n\n  * Switch between 30+ models mid-conversation (GPT-5.5 is the default — it’s just the best for most things)\n  * Built-in image generation (GPT-image-2 + others)\n  * Vision/image analysis\n  * Web search for real-time answers\n  * Chat history so you can go back and compare responses\n\n\n\nTech details for anyone curious:\n\n  * OpenAI API via OpenRouter for model routing\n  * FastAPI backend, vanilla JS frontend\n  * SQLite for chat persistence\n  * Runs on a free-tier GCP VM\n  * Built the whole thing solo\n\n\n\nThe coolest thing I learned: GPT-5.5 consistently outperforms most other models on multi-step reasoning. It’s not even close on complex tasks.\n\nI’m a student and still learning — would love feedback from this community, especially on how I’m using the API. Any tips on optimizing token usage or improving streaming performance?",
  "title": "Built a multi-model AI playground at 16 — OpenAI models are the backbone"
}