The Anatomy of a Deepfake Scam
The scam is devastatingly simple and effective. Fraudsters find a few seconds of a person's voice online, often from a social media post, Instagram Reel, or even a public YouTube video. Using widely available and increasingly sophisticated AI tools, they clone this voice. The clone can then be used to say anything the scammer types, complete with the original person's tone and inflection. The scammer, now armed with a trusted voice, calls a family member, often a parent or grandparent, with an urgent story: a car crash, a wrongful arrest, a medical emergency. The goal is to create panic and short-circuit rational thought, pressuring the victim to transfer money immediately via UPI or other instant payment methods before they have a chance to verify the story. This exploits human psychology, using fear and trust to bypass all traditional security measures.
Why Current Security Is Not Enough
WhatsApp is built with end-to-end encryption and offers security features like two-factor authentication (2FA). While these are crucial for protecting your account from being hacked, they are powerless against voice-cloning scams. These scams don't break the app's security; they break the user's trust. The fraud happens outside the digital code, in the realm of emotional manipulation. Recent features, such as a warning before a user messages an unknown number, help but do not go far enough. Scammers can spoof phone numbers to make the call appear to come from a saved contact, and the sheer believability of a loved one's voice in distress is often enough to override any warning pop-up. The problem is particularly acute in India, where 47% of adults have either been a victim of an AI voice scam or know someone who has, nearly double the global average.
The Argument for Verbal Proofs
If technology creates the problem, a low-tech human solution might be the answer. This is where verbal proofs come in. The concept is a modern adaptation of a classic espionage technique: a pre-established code word or question-and-answer pair known only to a trusted circle. App-based two-factor authentication combines something you have (your phone) with something you know (a PIN). A verbal proof instead adds an interactive challenge on top of the voice itself: a clone can reproduce how a person sounds, but it cannot supply a secret the scammer never learned. It's a 'safe word' for financial emergencies. When a call demanding urgent funds comes in, no matter how convincing the voice, the protocol is to ask for the verbal proof. If the caller cannot provide it, the call should be treated as fraudulent.
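The rule described here can be written down as a small decision procedure. The Python sketch below is only an illustration of that logic, not a feature of WhatsApp or any other app. The names check_verbal_proof and Verdict and the example phrase are hypothetical, and in real life the check happens in conversation, not in software.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    PROCEED_WITH_CAUTION = "proceed with caution"
    TREAT_AS_FRAUD = "treat as fraud"


def normalise(phrase: str) -> str:
    """Lower-case and collapse whitespace so small wording slips do not matter."""
    return " ".join(phrase.lower().split())


def check_verbal_proof(agreed_answer: str, spoken_answer: Optional[str]) -> Verdict:
    """Apply the family rule: a caller who cannot give the proof is treated as a fraudster."""
    if spoken_answer is None:
        # The caller dodged, stalled, or changed the subject.
        return Verdict.TREAT_AS_FRAUD
    if normalise(spoken_answer) == normalise(agreed_answer):
        return Verdict.PROCEED_WITH_CAUTION
    return Verdict.TREAT_AS_FRAUD


# Hypothetical example: the agreed phrase is a made-up family nickname for a car.
print(check_verbal_proof("The Blue Rattler", "the blue rattler").value)
print(check_verbal_proof("The Blue Rattler", None).value)
```

The "proceed with caution" label is an assumption of this sketch: a correct answer removes one red flag, but it does not replace calling the person back when that is possible.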
Putting Verbal Codes into Practice
Implementing this defence is simple. Families should agree on a secret word or phrase that is unique and not easily guessable from their social media profiles. Avoid common names, birthdays, or pet names. A good safe word is often a shared memory or an inside joke, something an outsider could never know. For example, instead of "What was my first dog's name?" a better question might be, "What did we call the car with the broken radio?" The key is to make it a non-negotiable family rule: no proof, no money. It needs to become as ingrained a habit as locking the front door. Experts recommend hanging up and calling the person back on their known number as a first step, but a verbal proof adds another, more robust layer of security for situations where a call-back isn't immediately possible.
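The advice on choosing a safe word can also be expressed as a rough check. The Python sketch below flags a candidate phrase that is very short or that contains a detail an outsider could find on social media. The function name is_weak_safe_word, the eight-character minimum, and the sample names and dates are all assumptions made for illustration, not an established standard.

```python
from typing import List


def is_weak_safe_word(candidate: str, public_details: List[str], min_length: int = 8) -> bool:
    """Return True if the phrase is too short or reuses a publicly known detail.

    public_details holds things an outsider could look up: names, birthdays,
    pet names. The min_length threshold is an arbitrary assumption.
    """
    text = candidate.lower()
    # Very short phrases are easier to guess.
    if len(text.replace(" ", "")) < min_length:
        return True
    # Reject phrases that contain any publicly visible detail.
    return any(
        detail.lower() in text
        for detail in public_details
        if detail.strip()  # ignore empty entries, which would match everything
    )


# Hypothetical family details visible on social media.
public = ["Bruno", "Priya", "14 March"]

print(is_weak_safe_word("Bruno", public))                          # pet name, too short
print(is_weak_safe_word("the car with the broken radio", public))  # shared memory
```

A check like this can only rule out obvious choices. It cannot tell whether a phrase is truly known only to the family, which is why a shared memory or an inside joke remains the better guide.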
The Responsibility of Platforms and People
While individuals and families must adapt, platforms like WhatsApp have a role to play. They could integrate features that help normalise this behaviour, such as prompts during call setup that ask if a user wants to establish a verbal password with a contact. They could run large-scale public awareness campaigns in India, educating users about the rise of voice scams and the simple, effective defence of using a family safe word. The Indian government is already showing concern, having recently halted a WhatsApp username feature over fears it could increase fraud. This indicates an appetite for stronger user protections. The ultimate defence, however, is a prepared and sceptical public that understands that the sound of a familiar voice is, unfortunately, no longer definitive proof of identity. The trust we place in our own ears must now be backed by a secret we hold in our minds.