Study Finds AI Voices Nearly Indistinguishable From Human, Prompting Warnings About Deception

Artificial intelligence (AI)-generated voices have long been criticized for sounding unnatural, like the robotic tones of virtual assistants such as Siri or Alexa. A recent study from the United Kingdom, however, has found that AI speech synthesis has advanced to the point where its voices closely resemble human ones, making it difficult for the average listener to tell them apart. The finding has raised concerns that bad actors could use AI-generated voices for fraud and other crimes.

Queen Mary University of London said in a press release last month that AI-generated voices are now part of everyday life, whether one is talking to Apple’s Siri or Amazon’s Alexa or navigating an automated customer service line over the phone.

According to Nadine Lavan, a senior lecturer in psychology at the university, “While these voices may not sound fully human yet, it was only a matter of time before AI technology could produce natural, human-like speech. Our research shows that this time has come, and we urgently need to understand how people perceive these lifelike voices.”

Lavan’s research team compared real human voices with two types of synthetic voices produced by advanced AI speech synthesis tools. Some of the synthetic voices were “cloned” from recordings of real people, deliberately mimicking specific human voices, while others were generated from scratch by a large language model without imitating any particular individual.

Participants were presented with 80 different voice samples (40 AI-generated voices and 40 real human voices) and asked to identify which were authentic human voices and which were AI-generated.

On average, the from-scratch voices were mistaken for human only 41% of the time, indicating that in most cases people could still tell them apart from real human voices.

The cloned AI voices, however, were misclassified as human a majority of the time (58%), while genuine human voices were correctly identified as human only 62% of the time. The gap between those two rates was not statistically significant, leading the researchers to conclude that listeners cannot reliably distinguish voice clones from real human voices.
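To make that statistical claim concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the study’s actual analysis (the article does not describe the researchers’ methods); it simply assumes the 40-samples-per-category design described above, rounds the reported percentages to whole counts, and runs an ordinary chi-squared test on the resulting 2x2 table to show why a 58% vs. 62% gap is indistinguishable from noise at this sample size.

```python
# Illustrative only: a rough check of why 58% vs. 62% is not a
# statistically significant difference at this sample size. This is
# NOT the study's actual analysis, which the article does not detail.
from scipy.stats import chi2_contingency

N = 40  # voice samples per category, as described in the article

# Reported rates, rounded to whole counts out of 40 samples each
cloned_judged_human = round(0.58 * N)  # ~23 of 40 cloned AI voices judged human
real_judged_human = round(0.62 * N)    # ~25 of 40 real voices judged human

# 2x2 contingency table: rows = voice type, cols = judged human / judged AI
table = [
    [cloned_judged_human, N - cloned_judged_human],
    [real_judged_human, N - real_judged_human],
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.3f}, p = {p:.3f}")
# p comes out far above 0.05, consistent with the researchers' conclusion
# that judgments of cloned and real voices do not differ significantly.
```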

Lavan noted that her research team quickly and easily created voice clones, or deepfakes, of real individuals (with their consent) using commercially available software. She stated, “This process requires little to no expertise, just a few minutes of recording, and almost no money, which shows how accessible and mature AI speech technology has become.”

The rapid development of AI has raised significant ethical, copyright, and security concerns, particularly around misinformation, fraud, and impersonation. Criminals who clone a person’s voice with AI can more easily bypass banks’ voice authentication systems or trick that person’s loved ones into transferring money.

Such scams have already been reported around the world. In July of this year, for example, Sharon Brightwell of Florida was cheated out of $15,000 by scammers using an AI-generated voice.

According to WFLA News, Brightwell received a call from a number that appeared to be her daughter’s. A woman on the line tearfully claimed to be her daughter and said she had been in a car accident; the voice sounded remarkably like her daughter’s.

The woman said she had caused the accident while texting and that the police had confiscated her phone. A man claiming to be her daughter’s lawyer then told Brightwell that her daughter was in custody and needed $15,000 for bail.

Believing the voice to be her daughter’s, Brightwell followed the payment instructions and sent $15,000. She stated, “Nobody could convince me that was not her. I know the cry of my daughter.”

The scammers later called again, claiming that a pregnant woman involved in the accident had lost her unborn child and demanding $30,000 in compensation to avoid legal action against Brightwell’s daughter. This time, however, the family contacted the daughter directly, confirmed she was safe, and exposed the lie.

Brightwell’s family suspects the scammers used videos of her daughter posted on social media platforms such as Facebook to clone her voice and carry out the fraud.

Hoping to spare others the same ordeal, Brightwell said, “My husband and I just retired, and that money was our savings.”

She urged people to guard against such scams, for example by agreeing on a family code word to verify identity during phone emergencies. If the caller cannot supply the correct code word, it is best to hang up immediately.