If only pictures could talk... well, now they can.
From helping video content creators maintain eye contact with the camera to producing stock footage, AI is being used in more creative ways every day.
Now, tech giant Microsoft is launching an AI model named VASA-1 that brings pictures to life.
The technology animates still images of people's faces, making them talk or sing.
'Our method is capable of not only producing precise lip-audio synchronization, but also generating a large spectrum of expressive facial nuances and natural head motions,' the tech giant explained on its official website.
'It can handle arbitrary-length audio and stably output seamless talking face videos.'
Whilst it might sound like Apple Live Photos 2.0, it still has some way to go before it's perfect. For example, the virtual people seem to move around quite a lot whilst speaking, and their teeth appear to change shape mid-conversation.
'Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there's still a gap to achieve the authenticity of real videos,' Microsoft explained.
But the creative potential certainly shines through. In one example, the AI brings a whole new side of Leonardo da Vinci's famous Mona Lisa to life, rapping with an American accent.
Microsoft researchers describe VASA as a 'framework for generating lifelike talking faces of virtual characters'.
The company said: 'It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors.
'Our method is capable of not only producing precise lip-audio synchronisation, but also capturing a large spectrum of emotions and expressive facial nuances and natural head motions that contribute to the perception of realism and liveliness.'
However, Microsoft acknowledges the tool could be exploited and 'misused for impersonating humans', and is therefore not releasing it for public use.
Moreover, experts have warned of the risks of making people appear to say things that they never said.
It also opens up the risk of fraud, as people online could be deceived by fake messages from seemingly trustworthy likenesses of people they know.
Jake Moore, a security specialist at ESET, said: 'Seeing is most definitely not believing anymore.'
He continued: 'As this technology improves, it is a race against time to make sure everyone is fully aware of what it is capable of and that they should think twice before they accept correspondence as genuine.'
In response to potential public concerns, the company has said that VASA-1 is 'not intended to create content that is used to mislead or deceive'.
They added: 'However, like other related content generation techniques, it could still potentially be misused for impersonating humans.
'We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection.'