DeepSeek Faces Declining Market Share, Plummeting Traffic, and User Losses

Recently, a report from the artificial intelligence research company SemiAnalysis revealed that the Chinese AI company DeepSeek is facing a decline in user retention and official website traffic, leading to a continuous decrease in market share.

The report indicated that, constrained by limited computing power, DeepSeek has sacrificed performance in at least two respects in order to offer low prices, degrading the user experience and driving users away.

According to the research report released by SemiAnalysis on July 3, user traffic surged after the release of the DeepSeek R1 model, with market share peaking at around 7.5%. That share subsequently declined, falling to about 5% by May; after 128 days, both traffic and user numbers showed signs of stagnation.

Traffic to DeepSeek's official website also worsened, falling 29% from February to May. Over the same period, traffic to other major models grew: ChatGPT by 40.6%, Anthropic's Claude by 36.5%, Google's Gemini by 85.6%, and Musk's Grok by 247.1%.

The data also showed that the share of DeepSeek's token traffic hosted by the company itself has declined month over month, dropping from 42% in March to 16% in May.

While usage of DeepSeek's self-hosted service has been sluggish, total usage of its reasoning model R1 and general-purpose model V3 on third-party hosting platforms has been steadily increasing.

In artificial intelligence, a "token" is the smallest unit of text a language model processes. A token can be a single character or a short sequence of characters, such as a word fragment or a whole word.
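As an illustration of tokenization only, the Python sketch below splits a sentence into tokens using OpenAI's open-source tiktoken library and its cl100k_base encoding; these are stand-ins chosen for convenience, not the tokenizer DeepSeek actually uses.

```python
# Illustrative only: tiktoken is OpenAI's open-source tokenizer, used here as a
# stand-in; DeepSeek's models use their own tokenizer and vocabulary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a common BPE encoding

text = "DeepSeek released the R1 reasoning model."
token_ids = enc.encode(text)                    # text -> list of integer token IDs
pieces = [enc.decode([t]) for t in token_ids]   # decode each ID back to its text piece

print(f"{len(token_ids)} tokens: {pieces}")
# A token may turn out to be a whole word, a word fragment, or punctuation.
```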

If DeepSeek's models are not losing ground on third-party hosting platforms, why are users shifting away from DeepSeek's own website and API toward third-party providers hosting the same open-source models?

The report pointed out that DeepSeek sacrificed performance in at least two aspects while providing low prices, leading to user attrition.

First, time-to-first-token (TTFT). DeepSeek makes users wait several seconds before the model returns its first token. For a mere $2-4, users can get near-zero latency from third-party hosting platforms such as Parasail or Friendli, and although Microsoft Azure's hosted service costs 2.5 times as much as DeepSeek's, it cuts the delay by a significant 25 seconds.
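For readers who want to check TTFT themselves, here is a minimal sketch that streams a chat completion from an OpenAI-compatible endpoint and times the first content chunk. The base URL, model name, and API key are placeholders I have assumed, not details taken from the report.

```python
# Minimal TTFT measurement sketch. Assumption: the provider exposes an
# OpenAI-compatible streaming API; base_url and model below are placeholders.
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-host.com/v1",  # hypothetical hosting provider
    api_key="YOUR_API_KEY",
)

start = time.perf_counter()
stream = client.chat.completions.create(
    model="deepseek-r1",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello."}],
    stream=True,
)

for chunk in stream:
    # The first chunks may carry only role metadata; wait for actual text.
    if chunk.choices and chunk.choices[0].delta.content:
        ttft = time.perf_counter() - start
        print(f"Time to first token: {ttft:.2f} s")
        break
```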

Second, the context window. DeepSeek's 64K context window is among the smallest offered by mainstream model providers, which limits use cases such as programming that need long contexts. At the same price, users can get more than 2.5 times the context window from services such as Lambda and Nebius.

The context window is the model's "short-term memory": the maximum number of tokens it can hold before early parts of the conversation must be dropped to make room for new ones.
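To make the "short-term memory" idea concrete, here is a hedged sketch of how a client might trim old conversation turns to stay within a fixed token budget. The crude word-count token estimate and the use of 64K as the budget are illustrative assumptions, not DeepSeek's implementation.

```python
# Illustrative sketch: keep a conversation within a fixed context window by
# dropping the oldest turns. The token count is a rough whitespace
# approximation; a real client would use the model's actual tokenizer.
CONTEXT_WINDOW = 64_000  # e.g. the 64K-token window cited in the report

def count_tokens(message: dict) -> int:
    """Rough token estimate: about one token per whitespace-separated word."""
    return len(message["content"].split())

def trim_to_window(messages: list[dict], budget: int = CONTEXT_WINDOW) -> list[dict]:
    """Drop the oldest messages until the remaining history fits the budget."""
    kept: list[dict] = []
    total = 0
    for msg in reversed(messages):        # walk from newest to oldest
        cost = count_tokens(msg)
        if total + cost > budget:
            break                         # everything older is "forgotten"
        kept.append(msg)
        total += cost
    return list(reversed(kept))           # restore chronological order

history = [
    {"role": "user", "content": "word " * 40_000},
    {"role": "assistant", "content": "word " * 30_000},
    {"role": "user", "content": "latest question"},
]
print(len(trim_to_window(history)))       # oldest turn no longer fits -> 2
```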

Benchmark tests of the DeepSeek V3 model on AMD and NVIDIA chips showed that model providers lower their cost per token by processing more user requests simultaneously on a single GPU or GPU cluster, a technique known as batching.

The trade-off is that users must tolerate higher latency and lower per-user throughput, which sharply degrades the user experience.
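The back-of-envelope sketch below uses invented numbers (the $2/hour GPU cost and the throughput-scaling function are assumptions, not figures from the report) to show why aggressive batching lowers a provider's cost per token while cutting the speed each individual user sees.

```python
# Back-of-envelope illustration of the batching trade-off described above.
# All figures are invented for illustration only.
GPU_COST_PER_HOUR = 2.0          # assumed $/hour for one GPU

def aggregate_tokens_per_sec(batch_size: int) -> float:
    """Assumed scaling: larger batches keep the GPU busier, so total
    throughput rises with batch size, but less than linearly."""
    return 400 * batch_size ** 0.7

for batch in (1, 8, 64, 256):
    total_tps = aggregate_tokens_per_sec(batch)
    per_user_tps = total_tps / batch                      # speed each user actually sees
    cost_per_million = GPU_COST_PER_HOUR / 3600 / total_tps * 1e6
    print(f"batch={batch:4d}  per-user={per_user_tps:6.1f} tok/s  "
          f"cost≈${cost_per_million:5.2f} per million tokens")
```

Under these assumed numbers the provider's cost per million tokens drops by roughly 50x going from a batch of 1 to a batch of 256, while each user's generation speed falls from about 400 to about 75 tokens per second, mirroring the latency and throughput penalty the report describes.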

Why doesn’t DeepSeek care about user experience?

The report highlighted that by batching extremely aggressively, DeepSeek minimizes the compute spent on inference and external-facing services, keeping as much computing power as possible in-house for research and development. DeepSeek's focus is on AGI (artificial general intelligence) rather than making money from end users.

On April 15, the U.S. Department of Commerce announced that NVIDIA's H20 chips and AMD's MI308 chips, which were previously exportable to China, would require new export licenses, effectively banning their sale to China.

The H20 is a downgraded chip that NVIDIA designed specifically for the Chinese market to comply with U.S. export controls, which had already barred the sale of its most advanced chips to China. Although the H20 is not NVIDIA's most powerful chip, DeepSeek's models rely heavily on it.

Since DeepSeek launched its AI applications in January of this year, its security and data-protection practices have come under increasing international scrutiny. So far, several countries, including the United States and Germany, have banned the use of DeepSeek applications on government devices.