Nvidia breaks records in training and inference for real-time conversational AI
Nvidia’s GPU-powered platform for developing and running conversational AI, the kind that understands and responds to natural language requests, has hit several key milestones and broken some records. The results have big implications for anyone building on the company’s technology, companies large and small alike, because much of the code Nvidia used to achieve them is open source, written in PyTorch and easy to run.
The biggest achievement Nvidia announced today is breaking the hour mark in training BERT, one of the world’s most advanced AI language models and widely considered a good benchmark for natural language processing. Nvidia’s AI platform trained the model in just 53 minutes, a record. The trained model could then perform inference (that is, apply what it learned in training to produce results) in just over two milliseconds, another record; 10 milliseconds is the threshold the industry generally aims to stay under.
Nvidia’s breakthroughs aren’t just cause for bragging rights. These advances scale, and they provide real-world benefits for anyone working with its conversational AI software and GPU hardware. Nvidia achieved its record-setting training time on one of its SuperPOD systems, which is made up of 92 Nvidia DGX-2H systems running 1,472 V100 GPUs, and ran the inference on Nvidia T4 GPUs with Nvidia TensorRT, which beat the performance of even highly optimized CPUs by a wide margin. Nvidia is also making the BERT training code and a TensorRT-optimized BERT sample available on GitHub for anyone to use.