{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifq5savdh3lvwyxkh2acqs7i2gy74jaxpinp4ysr6b2rxnadk72wm",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkia2jqkyiu2"
  },
  "path": "/t/is-happyhorse-1-0-the-first-video-api-that-feels-production-ready/175589#post_1",
  "publishedAt": "2026-04-27T12:30:50.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "EvoLink"
  ],
  "textContent": "Alibaba just opened public API access for HappyHorse 1.0, the model currently ranked #1 on Video Arena’s blind tests.\n\nWhat caught my attention isn’t only the ranking. It’s the shape of the API:\n\n  * text-to-video\n  * image-to-video\n  * reference-to-video with up to 9 refs\n  * natural-language video editing\n\nAnd the pricing is simple enough to reason about: 0.9 RMB/sec at 720P, 1.6 RMB/sec at 1080P.\n\nWhat I’m wondering is this:\n\n**Does controllability matter more than raw quality now?**\n\nThe launch examples suggest HappyHorse is very good at following camera language and multi-shot structure. That feels more important for real products than one-off impressive outputs.\n\nFor example:\n\n  * “Shot 1 / Shot 2 / Shot 3” prompt structure\n  * explicit camera motion\n  * style-first prompts for anime or cinematic looks\n  * image-to-video workflows that preserve subject identity better than pure t2v\n\nThat sounds like a model designed for pipelines, not just experimentation.\n\nI’m also curious whether teams here would prefer:\n\n  1. one strong general video endpoint, or\n  2. separate endpoints like HappyHorse has for t2v / i2v / r2v / edit\n\nYou can try it via EvoLink.\n\nMy current take: splitting the API by workflow is the right call. It reduces ambiguity and makes production integration easier.",
  "title": "Is HappyHorse 1.0 the first video API that feels production-ready?"
}