The world may have witnessed its first crime powered by artificial intelligence: synthetic audio was used to imitate the voice of a chief executive and trick a subordinate into transferring more than 240,000 US dollars into a fraudulent account.
The company’s insurer, Euler Hermes, did not name the firm involved. The managing director received a call in which a voice resembling his superior’s instructed him to wire the amount to an account based in Hungary. The payment was supposedly meant to avoid late-payment fines, and the financial details of the transaction were sent by email while the managing director was still on the call. Euler Hermes said the software was able to mimic not only the voice but also the executive’s German accent, tonality, and intonation.
The thieves made a second attempt to transfer funds in the same way, but this time the managing director grew suspicious and called his boss directly. While he was on the phone with his real boss, the artificial voice rang in, demanding to speak with him.
Deepfakes have grown steadily more sophisticated in the last few years. Online platforms cannot detect them easily, and companies have struggled to respond. Their constant evolution has made it clear that simple detection alone will not suffice, as deepfakes keep gaining an audience through monetization and a steady stream of viral content. There are already apps that can paste someone’s face onto an actor in film clips as a form of entertainment. This kind of technology would have sounded fanciful a few years ago, but it can now be misused by anyone inclined to channel their creativity in improper ways. There are many positive uses too: it can humanize automated call systems and help people who have lost the ability to speak regain a voice. But left unregulated, it can enable fraud and cybercrime on a massive scale.
Cybersecurity firm Symantec reported that it had found at least three instances in which executives’ voices were mimicked to steal money from companies, though it declined to identify the victims.
In this technique, a person’s voice is processed and broken down into its phonetic building blocks, such as syllables or individual sounds. These units can then be rearranged to form new phrases with the same tone and speech patterns.
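The rearrangement step described above can be illustrated with a toy concatenative-synthesis sketch. This is a minimal illustration under heavy simplification, not a real voice-cloning system: the "audio" below is just short lists of numbers standing in for waveform samples, and the unit names, phrase, and `unit_bank` structure are invented for the example.

```python
# Toy sketch of concatenative speech synthesis: recorded speech is cut
# into small phonetic units, and those units are re-ordered to build
# phrases the speaker never actually said.

# Hypothetical unit inventory "extracted" from recordings of a target voice.
# In a real system each unit would be a waveform snippet plus pitch and
# duration metadata; here each is just a placeholder list of samples.
unit_bank = {
    "pay":  [0.1, 0.3, 0.2],
    "the":  [0.0, 0.1],
    "fine": [0.4, 0.2, 0.1],
    "now":  [0.3, 0.3],
}

def synthesize(units, bank):
    """Concatenate stored units in a new order to form a new phrase."""
    samples = []
    for name in units:
        if name not in bank:
            raise KeyError(f"no recording available for unit {name!r}")
        samples.extend(bank[name])
    return samples

# Assemble a phrase from units in an order the speaker never uttered.
phrase = synthesize(["pay", "the", "fine", "now"], unit_bank)
print(len(phrase))  # total number of samples in the assembled phrase
```

A production system would additionally smooth the joins between units and adjust pitch and timing so the result sounds natural, which is what makes modern imitations so convincing.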
Researchers are working hard to develop systems that can detect fraudulent audio clips and combat them. Google, meanwhile, has created one of the world’s most persuasive AI voice services, Duplex, which can call restaurants to book tables in a simulated, lifelike voice.
Scientists will have to be cautious about how these technologies are released, and build proper safeguards against scams alongside them.