Nvidia Brings AI to Cloud-Based Videoconferencing
GPU maker unveils Maxine, a platform for service providers that improves user experience and reduces costs
The COVID-19 pandemic has put a premium on cloud-based videoconferencing solutions, which were already seeing rising demand before the public health crisis hit. Now Nvidia wants to use its deep expertise in artificial intelligence (AI) to make the experience even better.
The Lowdown: The giant GPU maker recently unveiled Maxine, a platform that developers and videoconferencing providers can use to make meetings more personal and less costly.
The Details: Maxine is a cloud-native streaming video platform that videoconferencing service providers can run in the cloud on Nvidia’s GPUs. The platform uses AI capabilities to improve resolution, reduce background noise, make video meetings seem more face-to-face, and improve lighting.
Because all of this is done in the cloud rather than on user devices, no specialized hardware is needed. Maxine also reduces the bandwidth needed for video calls, which lowers costs. The AI software uses video compression technology that runs on the GPUs to analyze key facial points of each person on the call and intelligently re-animate the face in the video. This makes it possible to stream video with far less data flowing across the Internet than sending the entire screen of pixels would require.
Bandwidth can be cut to a tenth of what traditional video requires, which in turn reduces costs for providers.
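To make the scale of that saving concrete, here is a rough back-of-the-envelope sketch in Python. It is not Nvidia's method or any Maxine API. The only figure taken from the article is the one-tenth ratio. The baseline bitrate, the keypoint count, the bytes per coordinate, and the frame rate are all hypothetical values chosen for illustration. The sketch also ignores audio and any reference image a receiver would need to re-animate a face.

```python
# Illustrative arithmetic only. Every constant here is an assumption,
# except the factor of 10, which is the reduction cited in the article.


def reduced_bitrate_kbps(baseline_kbps: float, reduction_factor: float = 10.0) -> float:
    """Bitrate after applying the article's 'a tenth of traditional video' figure."""
    return baseline_kbps / reduction_factor


def keypoint_payload_bytes(num_keypoints: int, bytes_per_coord: int = 2) -> int:
    """Bytes needed to send one frame's facial keypoints as (x, y) pairs."""
    return num_keypoints * 2 * bytes_per_coord


def bitrate_kbps(bytes_per_frame: float, fps: int) -> float:
    """Convert a per-frame payload into kilobits per second."""
    return bytes_per_frame * 8 * fps / 1000


if __name__ == "__main__":
    baseline = 1500.0  # assumed bitrate of a conventional video call, in kbps
    print("Reduced bitrate (kbps):", reduced_bitrate_kbps(baseline))

    # 68 keypoints per face is a hypothetical count, not a Maxine figure.
    payload = keypoint_payload_bytes(num_keypoints=68)
    print("Keypoint-only stream (kbps):", bitrate_kbps(payload, fps=30))
```

Under these assumed numbers, a keypoint-only stream is small next to a conventional video stream, which is the intuition behind the savings described above. Actual results would depend on details the article does not give.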
Features that come from Nvidia’s work in generative adversarial networks (GANs) include:
> Face alignment: Faces are automatically adjusted so that people appear to be facing each other during a video call.
> Gaze correction: Eye contact is simulated, so that rather than looking down or to the side, participants appear to be looking right at each other.
> Avatars: Developers can enable call participants to use animated avatars rather than their own faces, and their voices and emotional tone drive the realistic animation.
> Auto frame: The option enables the video feed to follow the speaker even if they move away from the screen.
> Conversational AI: Developers can use Nvidia’s Jarvis software development kit (SDK) to integrate virtual assistants into the Maxine platform, with capabilities such as speech recognition, language understanding, and speech generation. These assistants can also take notes, set action items, and answer questions in human-like voices, as well as provide translations, closed captioning, and transcriptions.
Videoconferencing service providers can run multiple AI-based features simultaneously without harming latency and can support hundreds of thousands of users. Developers, software partners, startups, and computer makers that build audio and video apps and services can apply for early access to the Maxine platform.
The Impact: The videoconferencing space had already been growing before the pandemic hit. The coronavirus outbreak has only accelerated that trend as work-from-home and distance learning have become the norm, and Nvidia and other vendors are looking to improve the user experience. Synergy Research Group found that spending on unified communications and collaboration tools grew 7% year-over-year in the second quarter, to more than $12 billion, with hosted and cloud solutions growing by 18%.
Background: Several years ago, Nvidia turned its focus to AI and machine learning, arguing that its GPUs are better than traditional CPUs for running modern workloads. For Maxine, Nvidia, which is in the process of buying chip designer Arm for $40 billion, is leaning on products such as its AI SDKs and APIs, its DeepStream high-performance audio and video streaming SDK, and its TensorRT SDK for deep learning inference jobs. Many of the capabilities were built on hundreds of thousands of hours of machine learning training on Nvidia’s DGX systems.
The Buzz: “Videoconferencing is now a part of everyday life, helping millions of people work, learn and play, and even see the doctor,” said Ian Buck, vice president and general manager of accelerated computing at Nvidia. “Nvidia Maxine integrates our most advanced video, audio and conversational AI capabilities to bring breakthrough efficiency and new capabilities to the platforms that are keeping us all connected.”