The world’s first voice AI crime

We might have witnessed the world's first crime powered by artificial intelligence: synthetic audio was used to imitate the voice of a chief executive and trick his subordinate into transferring more than 240,000 US dollars into a secret account.

Euler Hermes, the company's insurer, did not name the firm involved. The company's managing director received a call in which a voice resembling his superior's instructed him to wire the amount to an account based in Hungary. The money was supposedly needed to avoid late-payment fines, and the financial details of the transaction were sent over email while the managing director was still on the call. Euler Hermes said the software was able to mimic the voice along with the executive's exact German accent, tonality, and cadence.

The thieves attempted to transfer funds in this manner a second time, but the managing director grew suspicious and called his boss directly. While he was on the phone with his real boss, the artificial voice called again, demanding to speak with him.

Deepfakes have grown in sophistication over the last few years. They cannot be detected easily by online platforms, and companies have struggled to handle them. Their constant evolution has made it clear that simple detection will not be enough, as deepfakes keep gaining audiences through monetization and a constant stream of viral content. There are apps that can put someone's face into any actor's film clips, turning the technology into a source of entertainment. This would have sounded fanciful a few years ago, but it can now be misused by anyone with a creative bent of mind channelled in improper ways. There are many positive uses too: the technology can humanize automated call systems and help people who cannot speak to talk again. But if left unregulated, it can enable fraud and cybercrime on a massive scale.

Cybersecurity firm Symantec reported that it had found at least three instances in which executives' voices were mimicked to loot cash from companies. It did not, however, name the victim companies.

In this technique, a person's voice is processed and broken down into its phonetic building blocks, such as syllables or individual sounds. These units can then be rearranged to form new phrases with a similar tone and speech pattern, as the sketch below illustrates.
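What the paragraph describes is essentially concatenative synthesis. Below is a minimal, illustrative sketch of that idea in Python; the `inventory` mapping of phonetic units to NumPy waveform snippets is an assumption for the example (each snippet is assumed longer than the crossfade window), and real voice-cloning systems additionally model prosody and accent.

```python
import numpy as np

def synthesize(phrase_units, inventory, crossfade=64):
    """Rearrange stored phonetic units into a new phrase.

    phrase_units: e.g. ["HH", "AH", "L", "OW"] to approximate "hello"
    inventory:    hypothetical dict mapping each unit to a 1-D waveform
    crossfade:    samples blended between neighbouring units to smooth joins
    """
    out = inventory[phrase_units[0]].astype(np.float64)
    for unit in phrase_units[1:]:
        nxt = inventory[unit].astype(np.float64)
        fade = np.linspace(1.0, 0.0, crossfade)
        # Blend the tail of the phrase so far with the head of the next unit.
        out[-crossfade:] = out[-crossfade:] * fade + nxt[:crossfade] * fade[::-1]
        out = np.concatenate([out, nxt[crossfade:]])
    return out
```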

Researchers are working hard to develop systems that can detect fraudulent audio clips and combat them. Google, meanwhile, has created one of the world's most persuasive AI voice services: Duplex, which can call restaurants to book tables in a simulated, lifelike voice.

Scientists have to be cautious when releasing these technologies, and proper systems must be framed to fight the resulting scams.


Researchers develop deep neural network to identify deepfakes

Seeing was normally considered believing, until we learnt that photo-editing tools can be used to alter the images we see. Technology has taken this one notch higher: the facial expressions of one person can be mapped onto another in realistic videos known as deepfakes. However, none of these manipulations is undetectable, because all image- and video-editing tools leave traces that can be identified.

A group of researchers in the Video Computing Group of Amit Roy Chowdhury at the University of California, Riverside has created a deep neural network architecture that can detect manipulated images at the pixel level with very high accuracy. Roy Chowdhury is a professor of electrical and computer engineering at the Marlan and Rosemary Bourns College of Engineering and a Bourns Family Faculty Fellow. The study has been published in the IEEE Digital Library.

As artificial intelligence researchers explain, a deep neural network is a computer system that has been trained to perform a specific task, which in this case is identifying altered images. These networks are organised in several connected layers.
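To make "several connected layers" concrete, here is a minimal PyTorch sketch in which each layer feeds the next; the layer sizes are arbitrary placeholders, not the architecture from the paper.

```python
import torch.nn as nn

# Raw pixels go in, increasingly abstract features come out, ending in a
# single "was this pixel altered?" logit per pixel.
pixel_classifier = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # layer 1
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # layer 2
    nn.Conv2d(32, 1, kernel_size=1),                         # output layer
)
```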

Objects in images have boundaries, and whenever an object is inserted into or removed from an image, its boundary will differ from the boundaries naturally present. People with good Photoshop skills will try their best to make these boundaries look natural, but examining the image pixel by pixel brings out the differences. As a result, by checking the boundaries, a computer can distinguish between a normal and an altered image.
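As a toy illustration of this boundary cue (and not the researchers' actual method), the sketch below scores each pixel of a grayscale image by how unusual its local second-derivative response is; spliced edges often stand out in such maps.

```python
import numpy as np
from scipy.ndimage import laplace

def boundary_anomaly_map(gray):
    """Score each pixel of a 2-D grayscale array for abrupt local change.

    High z-scores mark pixels whose neighbourhood changes unusually sharply
    relative to the rest of the image: candidate splice boundaries worth a
    closer look.
    """
    response = np.abs(laplace(gray.astype(np.float64)))
    return (response - response.mean()) / (response.std() + 1e-9)
```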

The scientists took a large photo dataset and labelled the images that had not been manipulated, as well as the relevant pixels along the boundaries of manipulated regions in the altered images. The neural network was fed this information about the manipulated and natural regions of the images. It was then tested on a separate set of images, where it could successfully detect the manipulated images most of the time, along with the altered region, and it provided the probability that an image had been manipulated. The scientists are working with still images for now, but the technique can also be applied to deepfake videos.
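Reusing the layered-model idea sketched earlier, a training loop of the kind this paragraph describes might look as follows in PyTorch; the masks, model, and hyper-parameters here are illustrative assumptions, not the published setup.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                       # toy stand-in for the paper's network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1),                     # one logit per pixel
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()             # per-pixel "manipulated?" decision

def train_step(images, masks):
    """images: (N,3,H,W) floats; masks: (N,1,H,W) with 1 marking altered pixels."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), masks)
    loss.backward()
    optimizer.step()
    return loss.item()

def manipulation_map(image):
    """Per-pixel probability that regions of one (3,H,W) image were altered."""
    with torch.no_grad():
        return torch.sigmoid(model(image.unsqueeze(0)))[0, 0]
```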

Roy Chowdhury pointed out that a video is essentially a collection of still images, so a method that works on a still image can also be applied to a video, frame by frame. The challenge lies in figuring out whether a given frame in a video has been altered, and there is a long way to go before deepfake videos can be identified by automated tools.
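The frame-by-frame idea fits in a few lines; in the sketch below, `detect_manipulation` is a hypothetical stand-in for any per-image detector, such as the one sketched above.

```python
import cv2

def score_video(path, detect_manipulation):
    """Decode a video into still frames and score each one individually."""
    scores = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:                # end of stream
            break
        scores.append(detect_manipulation(frame))  # one still image at a time
    cap.release()
    return scores                 # per-frame manipulation scores
```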

Roy Chowdhury also pointed out that cybersecurity is a cat-and-mouse game: as defence mechanisms improve, attackers come up with better alternatives. In his view, a combination of human and automated systems is the right mix for the task. Neural networks can compile a list of suspicious images and videos for people to review, and automation can then reduce the amount of data that has to be sifted through to determine whether an image has been altered. He said this might become possible within a few years with the help of these technologies.
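That human-plus-machine triage might be organised as in the sketch below; the 0.5 review threshold and the `score_fn` interface are assumptions for illustration.

```python
def triage(items, score_fn, review_threshold=0.5):
    """Split items into (auto_cleared, needs_human_review) by model score."""
    cleared, flagged = [], []
    for item in items:
        score = score_fn(item)    # model's probability the item was altered
        (flagged if score >= review_threshold else cleared).append((item, score))
    flagged.sort(key=lambda pair: -pair[1])   # most suspicious first for reviewers
    return cleared, flagged
```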

Journal Reference: IEEE Digital Library


Latest Samsung AI can produce an animated deepfake from a single image

Samsung's AI technology is amazingly creepy and will make our deepfake problem worse. Engineers from the Samsung AI Center and the Skolkovo Institute of Science and Technology in Moscow have developed an AI technique that can transform a single image into an Animoji-like video of a face making speech expressions; the only difference is that it will not look exactly like a real video of you. The AI morphs the person's face and gets all the expressions right, which makes it difficult to recognise whether a clip shows the real person or a morphed one. Until now, deepfakes have required huge datasets of images to create a realistic forgery.

The artificial intelligence system used by Samsung differs from other modern approaches in that it does not use 3D modelling; it can generate a fake clip from just a handful of pictures, and the realism of the clip grows with the number of images provided.

The researchers working on this technology claim it can be used for a host of applications, including video games, film, and TV, and that it can be applied to paintings as well. Though Samsung's AI is cool, it comes with its own banes: it could be used to morph pictures of celebrities and other people for anti-social activities and may result in an increasing number of crimes.

This new approach provides a decent improvement on past work by teaching the neural network how to manipulate existing facial landmark features into a more realistic-looking moving video. That knowledge can then be deployed on a few pictures, or even a single picture, of someone the AI has never seen before. The system uses a convolutional neural network, a type of neural network based on biological processes in the animal visual cortex. It is particularly adept at processing stacks of images and recognising what is in them; the "convolution" essentially recognises and extracts parts of those images.
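To see what "extracting parts of those images" means in practice, here is a single convolution applied to a stack of face crops in PyTorch; each filter responds to a small local pattern (an edge, the corner of a mouth), and stacking such layers builds up landmark-like features. This is purely illustrative, not Samsung's model.

```python
import torch
import torch.nn as nn

faces = torch.randn(8, 3, 64, 64)   # a stack of 8 dummy RGB face crops
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)

feature_maps = conv(faces)           # shape (8, 32, 64, 64)
# Each of the 32 channels is one learned local-pattern detector swept
# across every position of every face in the stack.
print(feature_maps.shape)
```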

This technique tackles the difficulties that have long plagued artificially generated talking heads: the sheer complexity of modelling a moving head, and our ability, until now, to easily spot a fake one. Software for making deepfakes is now available free of cost and can create fully convincing, lifelike videos. We just have to remember that seeing is no longer believing. The research has been published on the pre-print server arXiv.org.

Though this technology looks pretty cool, I strongly think it should not be made available to the public, as it could lead to a flood of fake videos and fake news. What do you think about this technology? Tell us with a quick comment.