#HTE

High Quality Videochat with Low Bandwidth: Use a Still Photo of You While AI Moves Your Mouth

The graphics wizards at NVIDIA have figured out how we can all have high image quality videoconferences even with crappy bandwidth.

image

Standard videoconferencing uses a camera to capture pixels, which must then be transmitted over the connection. For every second we speak on camera, moving our faces, millions of pixels must be sent. When the connection can’t keep up with all of those pixels, the image quality is dialed down.
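
To put a rough number on "millions of pixels," here is a back-of-the-envelope sketch. The resolution, frame rate, and color depth are assumptions chosen for illustration (a 720p feed at 30 frames per second with 24-bit color), not figures from NVIDIA, and real video calls compress the stream heavily rather than sending raw pixels.

```python
# Assumed values for illustration only: a 720p feed at 30 frames per second.
WIDTH, HEIGHT, FPS = 1280, 720, 30
BYTES_PER_PIXEL = 3  # assumed 24-bit RGB color

# Pixels captured each second before any compression is applied.
pixels_per_second = WIDTH * HEIGHT * FPS

# The same quantity expressed as an uncompressed bit rate.
raw_bits_per_second = pixels_per_second * BYTES_PER_PIXEL * 8

print(f"{pixels_per_second:,} pixels/s")                       # 27,648,000 pixels/s
print(f"{raw_bits_per_second / 1e6:.1f} Mbit/s uncompressed")  # 663.6 Mbit/s uncompressed
```

Video codecs shrink that figure enormously, but the amount they have to squeeze is why quality drops when bandwidth runs short.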

NVIDIA’s system, Maxine, does not work by transmitting pixels. Instead, the videoconference starts with a keyframe, a still image of the speaker’s face. Then, as the speaker begins speaking and moving their face, Maxine’s AI-powered software captures only facial keypoints and transmits those over the network. Software on the receiving side then interprets those keypoints and re-renders the speaker’s face accordingly.
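
The keyframe-plus-keypoints idea can be sketched in a few lines. This is not Maxine's actual wire format or API, which the article does not describe; the keypoint count, the float32 packing, and the function names `encode_keypoints` and `decode_keypoints` are all hypothetical. The sketch only shows how small a per-frame keypoint message could be next to one raw frame, and it leaves out the hard part, the neural re-rendering on the receiving side.

```python
import struct

# Hypothetical values: the article does not say how many keypoints Maxine uses.
NUM_KEYPOINTS = 68
RAW_FRAME_BYTES = 1280 * 720 * 3  # one assumed uncompressed 720p RGB frame


def encode_keypoints(points):
    """Sender side: pack (x, y) keypoints as little-endian float32 values."""
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<{len(flat)}f", *flat)


def decode_keypoints(payload):
    """Receiver side: unpack the payload back into (x, y) pairs."""
    count = len(payload) // 4  # each float32 occupies 4 bytes
    flat = struct.unpack(f"<{count}f", payload)
    return list(zip(flat[0::2], flat[1::2]))


# Made-up keypoints in normalized image coordinates, standing in for a face tracker's output.
points = [(i / NUM_KEYPOINTS, 0.5) for i in range(NUM_KEYPOINTS)]

payload = encode_keypoints(points)
received = decode_keypoints(payload)

print(len(payload))                     # 544 bytes per frame of keypoints
print(RAW_FRAME_BYTES // len(payload))  # 5082, the raw frame is that many times larger
```

The comparison against a raw frame overstates the real-world saving, since conventional calls send compressed video, but it illustrates why sending coordinates instead of pixels needs so little bandwidth once the keyframe has arrived.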

image

It’s quite clever, and the difference is very noticeable:

image

image

image

image

Here’s what it looks like on video. Note that the software can even change the angle of your gaze:

NVIDIA refers to Maxine as AI video compression. You can learn more about it here.

image
https://www.core77.com/posts/102244/High-Quality-Videochat-with-Low-Bandwidth-Use-a-Still-Photo-of-You-While-AI-Moves-Your-Mouth