June 23, 2026
Alibaba's AI video model rises to No. 2 in global rankings, as OpenAI's Sora and ByteDance's Seedance fall away

Alibaba Cloud on Sunday released HappyHorse 1.1, a major upgrade to its AI video generation model that the company says delivers production-ready video synthesis across core content creation scenarios. The model is now live on Alibaba Cloud Model Studio with full API access for enterprise customers and developers, accompanied by a 40% sitewide launch discount for the first two weeks.

The release arrives at a moment of remarkable upheaval in the AI video generation market, and Alibaba appears keenly aware of the timing. OpenAI discontinued Sora after it proved financially unsustainable. ByteDance indefinitely shelved the international rollout of Seedance 2.0 following a barrage of copyright complaints from Hollywood studios. For enterprise procurement teams that had been evaluating or integrating those tools into marketing, advertising, and content production workflows, the competitive landscape has contracted sharply in a matter of months.

That contraction creates both an opportunity and a test for Alibaba. HappyHorse 1.1 is not a research demo or a consumer toy. It is an API-first product built for integration into enterprise software stacks, priced for volume, and backed by a $52.7 billion global infrastructure buildout. Whether it can convert technical capability into enterprise adoption, particularly in Western markets navigating intensifying U.S.-China tech tensions, will determine whether Alibaba can establish itself as a serious player in the generative video market that analysts expect to reach tens of billions of dollars by the end of the decade.

How HappyHorse climbed from anonymous benchmark entry to top-ranked video model

HappyHorse first appeared in early April as an anonymous submission on the Artificial Analysis Video Arena, an independent benchmarking platform where real users compare model outputs in blind, side-by-side evaluations. The model immediately claimed the top position in both text-to-video and image-to-video rankings.
Alibaba was subsequently confirmed as the creator, revealing it was built by the company's ATH (Alibaba Token Hub) AI Innovation Unit, a team previously part of the Future Life Lab under the Taobao and Tmall Group before a strategic organizational restructuring.

According to Arena.ai, HappyHorse 1.0 now holds the No. 2 position across all three Video Arena leaderboards. The platform noted the model scores 1,444 in both text-to-video and image-to-video categories, leading Google's Veo-3.1 (with audio) by 69 points in text-to-video and xAI's Grok-Imagine-Video by 23 points in image-to-video. In Elo-based ranking systems like Arena's, models gain or lose points based on whether users prefer their outputs in head-to-head comparisons, meaning persistent double-digit leads reflect a consistent quality gap as perceived by human evaluators, not a statistical fluke.

The model's architecture helps explain why. According to community-compiled technical documentation, HappyHorse is built around a 15-billion-parameter unified self-attention Transformer that processes text, image, video, and audio tokens within a single token sequence. Unlike many competitors that stitch together separate models for video and audio, HappyHorse operates as a unified system that handles all modalities in a single generation pass, eliminating the need for third-party dubbing or post-processing audio tools. For enterprise buyers evaluating total cost of ownership, that architectural simplicity translates directly into fewer integration points, fewer vendor dependencies, and faster time to production.

What the 1.1 upgrade fixes, and why it matters for commercial video production

The 1.1 upgrade targets a set of pain points that enterprise video production teams know intimately.
Alibaba Cloud described the release as "systematically optimized across core content generation scenarios," and the specific improvements reveal a model that has been tuned for commercial deployment rather than viral social media demos.

The most consequential upgrade is multi-image reference capability, which Alibaba calls R2V (Reference-to-Video). The feature allows users to upload multiple character reference images and maintain consistent identity across generated video, directly addressing one of the hardest problems in AI video production, where subjects tend to drift in appearance between frames or shots. For brands producing advertising campaigns, product videos, or serialized marketing content, identity consistency is not a nice-to-have; it is a requirement that has historically forced teams back to traditional production methods.

Motion quality receives a significant overhaul, with what Alibaba describes as "strengthened motion modeling" that addresses prior limitations in speed and fluidity. The company also made targeted improvements to visual texture, specifically calling out the elimination of "facial oiliness," "over-sharpening," and "unnatural textures." These artifacts have plagued commercial AI video since the technology emerged, and they immediately signal to viewers that content is machine-generated.

Two additional upgrades round out the release. HappyHorse 1.1 improves audio-visual synchronization, including what Alibaba claims is "zero-drift lip sync" for dialogue scenes and context-aware speech pacing, building on the 1.0 version's already notable ability to generate up to 15 seconds of 1080p video with synchronized audio output. The model also improves instruction-following for long and complex prompts, a critical differentiator for enterprise users who need to specify precise camera movements, lighting conditions, and narrative beats in a single generation pass rather than iterating through dozens of attempts.
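For a rough sense of scale on the Arena score gaps cited earlier (69 points in text-to-video, 23 points in image-to-video), the sketch below converts an Elo gap into an expected head-to-head preference rate. It assumes the textbook logistic Elo curve with a 400-point scale; the article does not say which exact formula or constants Video Arena uses, and the step size `k` in `update` is a hypothetical value chosen only for illustration.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected share of blind votes won by the higher-rated model.

    Assumption: standard logistic Elo curve. The 400-point scale is the
    chess convention, not a value confirmed for Video Arena.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


def update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Return the new rating after one comparison.

    k is a hypothetical step size; a win moves the rating up by
    k * (1 - expected), a loss moves it down by k * expected.
    """
    expected = expected_win_rate(rating - opponent)
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # Gaps reported in the article for text-to-video and image-to-video.
    for gap in (69, 23):
        rate = expected_win_rate(gap)
        print(f"{gap}-point lead -> {rate:.1%} expected preference")
```

Under these assumptions, a 69-point lead corresponds to being preferred in roughly 60% of comparisons and a 23-point lead to roughly 53%, which is a steady edge rather than a blowout.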
Sora's collapse and Seedance's freeze leave enterprise buyers with fewer choices than ever

The competitive context surrounding this launch is unusually favorable for Alibaba, and it is worth understanding why. OpenAI's Sora web and app experiences were discontinued on April 26, with the Sora API set to follow on September 24. The shutdown came after the product proved financially untenable: Sora cost roughly $1 million per day to operate but generated only about $2.1 million in total revenue, while active users dropped from a peak near 1 million to under 500,000. For enterprise teams that had integrated Sora into production pipelines, the abrupt withdrawal underscored the risks of depending on AI products that lack a sustainable business model, a cautionary tale that procurement officers are unlikely to forget quickly.

ByteDance's Seedance 2.0, which many considered Sora's most formidable successor, ran into a different kind of wall. Netflix, Warner Bros., Disney, Paramount, and Sony sent legal threats to ByteDance over allegations of systematic copyright infringement after users generated viral clips featuring Hollywood intellectual property. ByteDance indefinitely postponed the international launch, and the
