{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreifq5savdh3lvwyxkh2acqs7i2gy74jaxpinp4ysr6b2rxnadk72wm",
"uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkia2jqkyiu2"
},
"path": "/t/is-happyhorse-1-0-the-first-video-api-that-feels-production-ready/175589#post_1",
"publishedAt": "2026-04-27T12:30:50.000Z",
"site": "https://discuss.huggingface.co",
"tags": [
"EvoLink"
],
"textContent": "Alibaba just opened public API access for HappyHorse 1.0, the model currently ranked #1 on Video Arena’s blind tests.\n\nWhat caught my attention isn’t only the ranking. It’s the shape of the API:\n\n * text-to-video\n * image-to-video\n * reference-to-video with up to 9 refs\n * natural-language video editing\n\n\n\nAnd the pricing is simple enough to reason about: 0.9 RMB/sec at 720P, 1.6 RMB/sec at 1080P.\n\nWhat I’m wondering is this:\n\n**Does controllability matter more than raw quality now?**\n\nThe launch examples suggest HappyHorse is very good at following camera language and multi-shot structure. That feels more important for real products than one-off impressive outputs.\n\nFor example:\n\n * “Shot 1 / Shot 2 / Shot 3” prompt structure\n * explicit camera motion\n * style-first prompts for anime or cinematic looks\n * image-to-video workflows that preserve subject identity better than pure t2v\n\n\n\nThat sounds like a model designed for pipelines, not just experimentation.\n\nI’m also curious whether teams here would prefer:\n\n 1. one strong general video endpoint, or\n 2. separate endpoints like HappyHorse has for t2v / i2v / r2v / edit\n\n\n\nYou can try it via EvoLink.\n\nMy current take: splitting the API by workflow is the right call. It reduces ambiguity and makes production integration easier.",
"title": "Is HappyHorse 1.0 the first video API that feels production-ready?"
}