The rising cost of calling large models has spurred the growth of intermediary services known as token relays. These services typically connect to models such as GPT, Claude, Gemini, and DeepSeek through a single unified interface, then resell API quota to developers, SMEs, and individual users at prices below the official rates.
How the low prices are achieved
The appeal of these services comes mainly from two things. First, users do not need to buy each model service separately; top-ups and API calls all go through one account. Second, some relays can obtain API access at less than the official retail price, which lets them undercut it.
The report mentions that some relays access the model interfaces through first-tier agents, while others cut costs with corporate support quotas, bulk accounts, or shared memberships. The technical barrier is low: a relay can be built on existing frameworks, which has drawn a large number of individuals and small teams into the market in a short period.
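To illustrate why the technical barrier is low, the core of a relay can be sketched in a few lines: map the requested model name to an upstream provider, then deduct the tokens used from the user's prepaid quota. This is a minimal sketch, not any real relay's code. The `UPSTREAMS` table, the `Relay` class, the user name, and the model name are all hypothetical, and a real service would also need authentication, billing, and the actual upstream calls.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: model-name prefix -> upstream provider label.
UPSTREAMS = {
    "gpt": "openai",
    "claude": "anthropic",
    "gemini": "google",
    "deepseek": "deepseek",
}


@dataclass
class Relay:
    # Remaining prepaid token quota per user (hypothetical accounting unit).
    quotas: dict = field(default_factory=dict)

    def route(self, model: str) -> str:
        """Pick the upstream provider from the model-name prefix."""
        name = model.lower()
        for prefix, provider in UPSTREAMS.items():
            if name.startswith(prefix):
                return provider
        raise ValueError(f"unsupported model: {model}")

    def charge(self, user: str, tokens: int) -> int:
        """Deduct tokens from the user's quota and return what is left."""
        remaining = self.quotas.get(user, 0) - tokens
        if remaining < 0:
            raise PermissionError("quota exhausted")
        self.quotas[user] = remaining
        return remaining


relay = Relay(quotas={"alice": 1000})
provider = relay.route("claude-example")  # placeholder model name
left = relay.charge("alice", 300)
print(provider, left)  # anthropic 700
```

The sketch keeps routing and quota accounting in one place, which is the "unified interface" users see: one account, one balance, several upstream models.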
Margins narrow and more sites shut down
Some platforms go on to recruit downstream agents and profit from price differences and commissions. In the early stage of rapid demand growth, some operators did earn high incomes from distribution, private-domain traffic, and paid training services. As the number of participants has grown, however, the margin available from token price differences alone has narrowed significantly.
The report states that competition in the market is intensifying and that margins through legitimate channels have fallen sharply. Smaller sites without stable traffic or a steady customer base find it even harder to stay in business, and some shut down shortly after launch.
Data and compliance issues emerge
Price competition has also increased the share of gray-area operations. Some merchants cut costs by "watering down" the model: in the background they substitute a cheaper model for the one the user paid for, or they limit the context length, memory window, or web access.
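The two forms of "watering down" described above suggest a basic client-side sanity check. The sketch below assumes the relay returns an OpenAI-style chat completion body with a `model` field and a `usage.prompt_tokens` count. The function name, the threshold parameter, and the sample response are hypothetical. A dishonest relay can rewrite these fields, so the check only catches careless substitution, and an upstream may legitimately return a dated snapshot name that differs from the requested one.

```python
def check_response(requested_model, response, min_prompt_tokens=0):
    """Return a list of warnings about a relay's response body."""
    warnings = []

    # Compare the model the relay reports against the one requested.
    returned = response.get("model")
    if returned != requested_model:
        warnings.append(
            f"model mismatch: asked {requested_model!r}, got {returned!r}"
        )

    # A much smaller prompt than the one sent may indicate truncated context.
    prompt_tokens = response.get("usage", {}).get("prompt_tokens")
    if prompt_tokens is not None and prompt_tokens < min_prompt_tokens:
        warnings.append(
            f"prompt may be truncated: {prompt_tokens} < {min_prompt_tokens} tokens"
        )
    return warnings


# Hypothetical response for a request that sent roughly 1000 prompt tokens.
resp = {"model": "cheap-model", "usage": {"prompt_tokens": 120}}
warnings = check_response("target-model", resp, min_prompt_tokens=1000)
for w in warnings:
    print(w)
```

Here `min_prompt_tokens` would come from the caller's own estimate of the prompt size. Comparing answers on known test prompts is a stronger check than trusting reported metadata.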
Beyond service quality, user data security is a prominent concern. Shared accounts and backend data transfer mechanisms can expose one user's files, chat logs, or links to other users. The report also mentions that some industry professionals have been asked whether they sell user data, which shows the risk is real.
Compliance risks are rising as well. The report cites cases in which relay operators were criminally prosecuted for reselling overseas large-model interfaces through overseas servers. If the business model involves circumventing normal cross-border network and service restrictions, operators may face even greater legal exposure.
Overall, token relays have lowered the barrier to using large models and met the demand of small and medium-sized users for low-cost, multi-model access. Amid the price war, however, service authenticity, account stability, data security, and compliance are becoming the industry's main pressure points.










