Is China’s Largest Personal Information Leak Negligent or Intentional?

Recently, what is reported to be the largest personal data leak in history occurred in China, exposing a staggering 4 billion records, including the personal information of 800 million WeChat users. The breach gave outsiders a glimpse into the massive scale of data collection by the Chinese Communist Party. Cyber experts believe that the leak raises serious concerns whether it resulted from negligence or was intentional.

On June 10th, Cybernews, an independent media organization specializing in data and internet security, revealed that on May 19th, China experienced what could be “the largest personal data leak in history,” with an unsecured 631GB database exposing 4 billion records.

According to the report, SecurityDiscovery.com, a security advisory company specializing in identifying and reporting data security vulnerabilities and leaks, collaborated with Cybernews. They managed to review 16 datasets from this leaked database, each containing records ranging from 500,000 to 800 million.

These datasets are a collection of various types of data, including information on WeChat and Alipay users, containing personal identification numbers, names, phone numbers, addresses, financial information, and other sensitive personal data. One dataset named “tw_db” even included information related to Taiwan.

The largest of these 16 datasets, “wechatid_db,” contains over 805 million records, consisting of ID data of WeChat users.

The second-largest dataset, “address_db,” contains over 780 million records, comprising address data with geolocation codes.

The third-largest dataset, “bank,” includes over 630 million financial data records, including payment card numbers, birthdates, names, and phone numbers.

The fourth-largest dataset, which has a Chinese name, “三因素核查” (roughly, “three-factor verification”), contains over 610 million records, mainly involving identification numbers, phone numbers, and usernames.

The fifth-largest dataset, “wechatinfo,” holds nearly 577 million records, comprising information on WeChat users beyond ID data, including metadata, communication logs, and even user conversations.

The sixth-largest dataset, “zfbkt_db,” consists of 300 million records, containing information on Alipay cards and tokens.

There is also a smaller dataset containing 20 million records related to financial data of Alipay users.

The remaining 9 datasets altogether contain over 353 million records, covering diverse categories such as gambling, vehicle registration, employment information, pensions, and insurance.

The report notes that these datasets represent only a small portion of a larger database, which the Cybernews team was able to glimpse only briefly before it was quickly shut down.

The repercussions of such a massive personal data leak range from large-scale phishing, ransomware, and fraudulent activities to state-supported intelligence collection and dissemination of false information.

The report highlights that for the users whose data has been exposed, this incident could be disastrous. However, because the database is anonymous and there are no channels for notifying victims, the affected individuals are left completely helpless.

According to Cybernews, in the limited time available its team could not attribute the data to any identifiable organization, because the database carried no source labels indicating ownership.

However, Cybernews stated that “the dataset was crafted very carefully and meticulously, intending to create a comprehensive behavioral, economic, and social profile for almost all Chinese citizens.”

“It takes time and effort to collect and maintain such databases, often associated with threat actors, governments, or very aggressive researchers,” Cybernews added.

A senior industry expert in network security, who preferred to remain anonymous, believed that a database of this scale could only have been built with the resources of the “national team.” He speculated that it might be a collaborative project between the Chinese government’s intelligence, public security, or national security systems and major big data companies, such as Tencent, the parent company of WeChat, possibly with some prominent cloud platforms also involved. In other words, creating such a database would take the Chinese Communist Party’s “national team” combined with outsourcing.

Cybernews’ report revealed that the massive database was open to the public without a password on May 19th and was shut down on May 20th, ending public access. The report did not specify how long this “open door” state lasted.

But why did the database end up in this vulnerable “open door” state, enabling the leak?

Cybernews did not address this question in their report.

The senior industry expert contacted by reporters suggested that the leak could have resulted from either negligence or intentional action.

He pointed out that because this vast database likely involved the Chinese government’s “national team” outsourcing to third-party contractors and even further subcontractors, security certification could grow lax as the chain of contractors lengthened, leading to security vulnerabilities.

Another possibility was that the database lacked protective mechanisms during internal testing, resulting in unintentional data exposure.

Regarding intentional actions, the industry expert mentioned that one party involved in the database might be colluding with the underground black market to trade data, a common practice in mainland China.

He also noted that someone in the database’s chain of interests might be concerned about being scapegoated in the future, so they could have created a “backdoor” or an exit strategy for themselves in advance.

To date, there has been no official response from the Chinese Communist Party regarding this massive personal data leak incident.

Data breaches involving personal information in China have been recurring. Cybernews cited previous reports on a leak of 1.5 billion records from Weibo, Didi Chuxing, and the Shanghai Communist Party, a leak of 1.2 billion records of Chinese user data by a mysterious organization, and the recent leak of records of 62 million iPhone users by hackers.

Internet users have previously mocked the Chinese Communist Party for claiming to protect citizens’ privacy while being the largest violator and leaker of personal information.

The industry expert mentioned that if this leak of 4 billion records continues to escalate and puts pressure on the Chinese authorities, their usual approach would be to shift the blame, saying it was done by “temporary workers” or attributing it to an “unforeseeable event that caused a system restart,” to evade responsibility.

Cybernews did not disclose how its team discovered the massive database in its vulnerable “open door” state, or at what point within that time window the discovery was made.

Cybernews does not publicly disclose the location of its headquarters, but publicly available information suggests it is likely based in an Eastern European country. It describes its investigative team as using “white-hat” hacker techniques to discover and securely disclose network security threats and vulnerabilities.

The industry expert found it “very strange” how an unknown foreign media outlet could be the first to know about this massive database leak.

He speculated that there might be many hidden stories behind the leak, suggesting that insiders may be acting as whistleblowers, creating a sort of insurance policy for themselves by leaving behind evidence in case they are later betrayed and scapegoated.

He described this as a boomerang effect, a self-defeating phenomenon within the Chinese Communist Party’s surveillance and monitoring industry. It indicates, he said, that not everyone in this industry chain staunchly supports the Party’s actions, and that many individuals worry about being exposed and scapegoated, which leads them to proactively disclose what they know.

Once someone starts to disclose information, everyone becomes wary of potential betrayal. Under this “sense of danger,” they instinctively seek help, potentially finding overseas channels through which to hand over their “insurance” and leave themselves a way out.

The industry expert emphasized that when big data is used for monitoring the people rather than serving them, the legitimacy of the Chinese Communist Party’s governance comes into question, and technical vulnerabilities become a risk that could trigger a political “dam burst.”

He suggested that what goes around comes around: the Party’s tactics might eventually backfire, and its misconduct may not go unpunished.