Author: Wall Street CN
On April 2, Volcano Engine announced two things at its roadshow in Wuhan.
First, the Seedance 2.0 API is now officially open for public beta testing to enterprise users. The official website states that it is "for enterprise users" and charges based on actual usage, no longer requiring a minimum guarantee of tens of millions of yuan.
The second is a number. As of March this year, daily token usage of the Doubao large model exceeded 120 trillion. It has doubled in the past three months and grown 1,000-fold since Doubao first began offering enterprise services in May 2024.
The Volcano Ark Experience Center simultaneously launched a new demo of Seedance 2.0, including a video of someone playing the piano that showcased precise audio-visual synchronization. The force of the keys falling, the fluidity of the finger movements, and the complete piano sound came close to the standard of professional-grade content.

This is not a routine iteration. The opening of the Seedance 2.0 API, the figure of 120 trillion, and the enterprise data knowledge management platform released on the same day together make clear what Volcano Engine is trying to do now.
The door to "openness" is still ajar.
In February, Seedance 2.0 was launched for closed beta testing, and its multimodal creation methods and built-in camera movement effects quickly garnered significant attention globally. Multiple demo videos on overseas social media platforms surpassed one million views. The sentiment on social media during that period was typical: on one hand, it was touted as "the strongest on earth," while on the other hand, it was seen as a sign that "Hollywood is doomed."
Feng Ji, CEO of Game Science, the developer of "Black Myth: Wukong," commented after trying Seedance 2.0 that it is "the strongest video generation model on the planet."
However, very few people could use it at the time. During the closed beta, access to Seedance 2.0 required a minimum guarantee in the tens of millions of yuan. High-tier access was in the hands of a few top teams, which led to gray-market practices such as group buying and private resale of API access. This situation lasted for nearly two months.
On April 2, Seedance 2.0 finally announced its official "opening," but anyone who rushes to the door will find that the threshold for entry is still quite high.
Billing is based on actual usage, and the minimum guarantee of tens of millions of yuan is no longer required. However, newly signed users receive only 10 concurrent connections by default, and this limit cannot be increased. Advanced features such as real faces and custom virtual avatars are not supported; users can only use the platform's public virtual avatar library for secondary creation. In addition, newly signed teams need to pay a deposit of approximately 1 million yuan, and can only gradually release and use the features after completing the established framework within one year.
This remains a significant hurdle for teams developing high-quality short dramas. The restrictions on real-person footage, a cap of 10 concurrent connections that cannot support mass production, and the 1 million yuan deposit together shut out a considerable number of small and medium-sized teams.
The platform explained that this is due to copyright and portrait rights protection. Seedance 2.0 has established copyright and portrait rights protection mechanisms that cover all modalities involved in video generation and the entire process before and after creation, in order to detect and defend against infringement, deepfakes, and similar abuse.
This consideration has a basis in reality: during the closed beta test in February, Seedance 2.0's ability to recreate the original voice from just a facial photo sparked a lot of ethical and data security discussions. In particular, the well-known blogger Tim from FilmHurricane, who participated in the first closed beta test, caused a sensation by posting a video, and the official team immediately suspended the real-person material reference function.
However, the business logic remains clear. Layering by capability and classifying by risk is essentially a user selection process. Institutions with substantial funding and compliance requirements are the enterprise clients that Volcano Engine truly wants to serve. Small and medium-sized teams are not being driven out, but rather being pushed to platforms that have already integrated with Seedance 2.0. For example, Chanmama, a service provider within the ByteDance ecosystem, was among the first to become a "subcontractor." This has led to a complete reorganization of profit distribution within the entire ecosystem.

Kuaishou's Keling is the other major competitor in the AI video field. By comparison, Kuaishou's strategy is more open, its pricing system for individual users is simpler, and its content security approach is more relaxed. Volcano Engine, on the other hand, has chosen a different path: establishing a traceable and accountable access framework on the enterprise side, ensuring that highly sensitive capabilities only flow to high-value, highly compliant clients.
It's too early to tell who's right and who's wrong. But these two strategies represent two completely different business judgments. Kuaishou is betting on a larger creator market, while Volcano Engine is targeting the mass production needs of industry clients. This is determined by their fundamental positioning.
What do 120 trillion tokens mean?
The figure of 120 trillion only has meaning when placed in a frame of reference.
According to a report released by IDC, large-model token calls on China's public cloud reached 114.2 trillion in 2024, with Volcano Engine ranking first in the Chinese market with a share of 46.4%. By the first half of 2025, this share had further increased to 49.2%, meaning that nearly one out of every two tokens on the Chinese public cloud was generated by Volcano Engine.
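To put these figures side by side, here is a minimal back-of-the-envelope sketch in Python. It uses only numbers quoted in this article, and it rests on two assumptions: that "exceeding 120 trillion" can be approximated as exactly 120 trillion per day, and that IDC's 114.2 trillion is a full-year total. The two figures also cover different scopes (Doubao's daily usage versus all public cloud calls in China), so the last ratio is a sense of scale, not a like-for-like comparison.

```python
TRILLION = 10**12

# Figures quoted in the article (treated as approximations, see assumptions above).
daily_tokens_now = 120 * TRILLION               # Doubao daily usage as of March
growth_since_may_2024 = 1_000                   # "grown 1,000-fold"
china_public_cloud_2024 = 114.2 * TRILLION      # IDC figure, assumed full-year total

# Daily volume implied for May 2024 by the 1,000-fold growth claim.
implied_may_2024_daily = daily_tokens_now // growth_since_may_2024

# Daily volume implied three months earlier by the "doubled" claim.
implied_three_months_ago = daily_tokens_now // 2

# How many days of current Doubao usage equal the 2024 public cloud total.
days_to_match_2024_total = china_public_cloud_2024 / daily_tokens_now

print(f"Implied May 2024 daily usage: {implied_may_2024_daily / 10**9:.0f} billion tokens")
print(f"Implied daily usage three months ago: {implied_three_months_ago / TRILLION:.0f} trillion tokens")
print(f"Days of current usage to match the 2024 total: {days_to_match_2024_total:.2f}")
```

Under those assumptions, the growth claim implies a starting point on the order of 120 billion tokens per day, and a single day of current Doubao usage is roughly the size of the entire 2024 public cloud total that IDC reported.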

But a long-standing question remains unanswered: how many of these 120 trillion tokens are consumed internally by ByteDance?
ByteDance's own products, including Douyin, Toutiao, CapCut, and Lark, are themselves massive consumers of tokens. One of the biggest criticisms of Volcano Engine over the past few years has been its high share of internal revenue and questionable external monetization. Public data has never broken down the ratio of internal to external traffic. A report at the end of last year mentioned that Volcano Engine's internal revenue accounted for about 70%. If this ratio doesn't decrease significantly, the actual scale of external monetization behind the 120 trillion token figure may be much smaller than the numbers suggest.

A more valuable statistic might be this: the number of companies with over one trillion tokens used on Volcano Engine has grown from 100 at the end of last year to 140. This represents deeply engaged, large clients, not just occasional trial visits. Adding 40 companies in three months is a significant growth rate. However, in comparison, Alibaba Cloud has a much larger enterprise customer base, and Tencent Cloud has its own stronghold in the financial and government sectors. Volcano Engine's commercialization narrative needs to focus on how many of these 140 companies are actually using a large-scale model to run core businesses, rather than just lingering in the PoC (proof-of-concept) stage.
Alibaba Cloud and Tencent Cloud are not in a good position in the large-scale MaaS market. Alibaba's Tongyi series has far less influence on the consumer side than on the enterprise side, and Tencent's Hunyuan is progressing more slowly. Baidu Wenxin started early, but in the wave of cost reduction and efficiency improvement, price wars have weakened its advantages. Volcano Engine was the first to drive the price of large-scale APIs into the "cent era," reducing the price per thousand tokens to the bottom of the industry in May 2024, after which almost all cloud vendors followed suit.
But as the price war has progressed to this point, it's no longer just about price.
Enterprise knowledge management: another tough nut to crack
On the same day as the Wuhan roadshow, another launch was even more low-key. Xiao Ran, General Manager of Volcano Engine Digital Intelligence Platform Solutions, proposed at the ArkClaw Data Intelligence Special Session that the positioning of AI applications in enterprise scenarios needs to be upgraded from "personal assistant" to "enterprise digital partner," and launched the Volcano Engine Enterprise Data Knowledge Management Platform.
The core logic of this product is to solve the problem of large models being "usable but not user-friendly" within enterprises. General-purpose models don't understand internal company rules, product knowledge, or historical decisions, so each call requires extensive added context, resulting in low efficiency and frequent hallucinations. What the enterprise data knowledge management platform does is build a company-specific "knowledge foundation" for the model: a three-layered system integrating personal knowledge bases, public knowledge bases, and scenario-based workspaces. This allows employees in different roles to trigger the knowledge they are authorized to access when calling the model, instead of starting from scratch each time.
This product line is an engineering extension of RAG (retrieval-augmented generation) and also one of the hardest areas of enterprise AI implementation. The challenges large-model cloud vendors face here are highly sensitive enterprise data, diverse data formats, and complex permission systems. A poor user experience in any of these areas will lead to non-renewal of contracts.
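The idea of role-scoped retrieval over three knowledge layers can be illustrated with a minimal Python sketch. This is not Volcano Engine's implementation or API: the `Doc` and `User` classes, the `visible`, `retrieve`, and `build_prompt` functions, and the keyword-overlap scoring are all hypothetical, chosen only to show how a permission filter can sit in front of retrieval so that each caller's prompt is filled with knowledge they are authorized to see.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Doc:
    layer: str   # "personal", "public", or "workspace" (hypothetical layer names)
    owner: str   # user id for personal docs, workspace name for workspace docs
    text: str


@dataclass
class User:
    user_id: str
    workspaces: set = field(default_factory=set)


def visible(doc: Doc, user: User) -> bool:
    """Permission check: which layer a document sits in decides who may read it."""
    if doc.layer == "public":
        return True
    if doc.layer == "personal":
        return doc.owner == user.user_id
    if doc.layer == "workspace":
        return doc.owner in user.workspaces
    return False


def retrieve(query: str, docs: list, user: User, top_k: int = 2) -> list:
    """Filter by permission first, then rank by naive keyword overlap."""
    terms = set(query.lower().split())
    scored = []
    for doc in docs:
        if not visible(doc, user):
            continue
        overlap = len(terms & set(doc.text.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, docs: list, user: User) -> str:
    """Assemble the context that would be sent to the model along with the question."""
    context = "\n".join(f"- {d.text}" for d in retrieve(query, docs, user))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    Doc("public", "", "refund policy is 30 days"),
    Doc("personal", "alice", "alice notes on refund exceptions"),
    Doc("workspace", "sales", "sales refund escalation steps"),
]
alice = User("alice", {"sales"})
bob = User("bob")

print(build_prompt("refund policy", docs, alice))
print(build_prompt("refund policy", docs, bob))
```

In this toy setup, the same question yields a richer context for `alice`, who belongs to the `sales` workspace, than for `bob`, who only sees the public layer. A production system would replace the keyword overlap with vector search and enforce permissions in the storage layer rather than in application code.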
Compared to consumer products like Seedance 2.0, which can go viral on social media, enterprise data knowledge management is a slower process. However, it offers higher average order value, stronger customer loyalty, and lower churn. This is an area that Volcano Engine needs to improve.
Flags were planted on every floor, but the vertical depth was the real challenge.
Breaking down Volcano Engine's moves reveals a clear layout.
Model layer: Doubao's comprehensive model family covers text, vision, video (Seedance), speech, code, and vector, essentially providing full-modal support.
Inference service layer: Volcano Ark, holding the number one market share nationwide.
Application layer: ArkClaw (AI programming assistant/enterprise version of OpenClaw) and Kouzi (low-code application building).
Data layer: The enterprise data knowledge management platform launched at the Wuhan stop provides a knowledge foundation for ArkClaw, enabling large models to "understand the enterprise's own knowledge assets."
The entire architecture, from the underlying computing power to the upper-layer applications, is theoretically interconnected.
But Alibaba Cloud is also following this path. The combination of the Tongyi model family, Alibaba Cloud Bailian, DingTalk AI, and an enterprise knowledge base follows almost identical logic. Tencent Cloud, relying on WeChat Work and Tencent Meeting, has a natural entry point in collaboration tool scenarios. Huawei Cloud has strong relationships with government and enterprise customers.
Ultimately, the key to success in enterprise software has never been just technology. It is service capabilities, the industry's ISV (Independent Software Vendor) ecosystem, the stability of private deployments, and the sales team's familiarity with government and enterprise decision-making processes. These have historically been ByteDance's weakest links.
Volcano Engine has already deeply penetrated the consumer internet and content industry. It cites 90% of mainstream car brands and 80% of top securities firms, and these figures are accurate. However, expanding from media, entertainment, and automobiles to manufacturing, healthcare, and government affairs requires organizational capabilities that are quite different from ByteDance's existing DNA.
Taken together, the three announcements show that Volcano Engine continues to stake out positions at every layer. Going deep at each of them will take time.