Advancements in voice AI come with widespread risk to biometrics

Deepfake voices are already a challenge for authentication systems. But the task is getting tougher as big players pursue voice AI products that could turn speech into a scalable attack surface, making synthetic speech a real risk to identity infrastructure.

The latest to join the likes of ElevenLabs and OpenAI in offering speech APIs is xAI – the same firm that gave the world Grok the Deepfake Nude Machine. Marktechpost reports that the company has launched standalone Speech-to-Text (STT) and Text-to-Speech (TTS) APIs, “both built on the same infrastructure that powers Grok Voice on mobile apps, Tesla vehicles, and Starlink customer support.”

The market for speech APIs is getting busier. Rapid advances in voice AI are lowering costs and skill barriers for voice cloning, and companies such as Deepgram and AssemblyAI already have established user bases. Others will follow xAI into the market.

The cumulative result is an undermining of trust in voice as an authentication factor – and a need to rethink speaker biometrics in the context of agentic identity.

Grok, say ‘I need help’ in the voice of Morgan Freeman

Grok’s APIs will make it even easier for millions of people to create believable synthetic voices. The text-to-speech API, which converts written text into spoken audio, “delivers fast, natural speech synthesis with detailed control via speech tags, and is priced at $4.20 per 1 million characters.” It supports 20 languages and five distinct voices.
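
To put the quoted price in perspective, here is a rough back-of-the-envelope sketch. Only the $4.20 rate comes from the quoted pricing; the script length and call volume are hypothetical figures chosen for illustration, and the function name is ours.

```python
# Illustrative arithmetic only. The rate is the quoted $4.20 per 1 million characters.
PRICE_PER_MILLION_CHARS_USD = 4.20


def tts_cost_usd(num_chars: int) -> float:
    """Estimate the synthesis cost in USD for a given number of characters."""
    return num_chars / 1_000_000 * PRICE_PER_MILLION_CHARS_USD


# Hypothetical workload (assumptions, not figures from the report):
script_chars = 300   # characters in one short scripted call
calls = 10_000       # number of calls voiced with that script

total = tts_cost_usd(script_chars * calls)
print(f"${total:.2f}")  # prints $12.60
```

Under those assumptions, voicing 10,000 short scripted calls would cost about $12.60 in synthesis fees, which suggests that price is no longer much of a barrier to generating speech at scale.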

Grok’s record on nefarious use speaks for itself. What are the chances the same user base that flooded X with fake nudes will see the potential for fraud and mischief in the AI’s TTS API? It is a rhetorical question, but it has real-world implications for voice as a reliable biometric modality for identity infrastructure.

In recent weeks, ElevenLabs launched a system to enable companies to deploy AI agents. According to USA Today, the tool “allows teams to convert internal documentation and workflows into conversational agents, without the need for extensive technical development.”

“These agents are designed to follow structured processes, but deliver responses that sound natural within context.”

This month, Microsoft also launched three new foundational AI models, including a voice generation engine, MAI-Voice-1.

Consider how many phone calls already come from bots. Now consider how easily one might use AI to clone the voice of your loved one. The threshold for certainty is disappearing, at least without rigorous voice liveness and continuous monitoring. The question becomes: is voice worth the risk?

Be careful whose voice offers an answer.
