Microsoft Speeds Up Deep Learning Training From Weeks to Minutes
Scientists and engineers typically build AI systems that learn via deep learning, and training these systems usually takes weeks. However, Microsoft and scientists from the Swiss National Supercomputing Centre have taken this one step further and trained deep learning models within minutes.
This is an astonishing feat in the realm of robotics and artificial intelligence: training runs that once took weeks could now produce results within hours or minutes.
Coupled with supercomputing technology, this means customers will now be able to solve problems such as image, video and speech recognition, as well as natural language processing. It could enable researchers to build technologies that previously could only be seen in science fiction. Research of this kind can greatly benefit multiple fields, given the complexity of the problems deep learning can handle.
According to Next Big Future, the team of researchers has scaled the Microsoft Cognitive Toolkit -- the open-source code that trains deep learning algorithms -- to more than 1,000 Nvidia Tesla P100 GPU accelerators on the Swiss lab's Cray XC50 supercomputer, nicknamed Piz Daint.
As the researchers explain, deep learning problems share similar algorithms with applications that run on massively parallel supercomputers. By handling inter-node communication over the Cray XC Aries network with a high-performance MPI library, each training job can use more computational resources.
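The article does not describe the training algorithm itself, so the following is only a rough sketch of one common way deep learning is spread across many nodes: synchronous data-parallel training, where each worker computes a gradient on its own slice of the data and the workers then average their gradients. The function names, the toy model y = w * x, and the in-process `allreduce_mean` (a stand-in for a real MPI allreduce over the network) are all assumptions made for illustration, not the team's actual code.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]

def make_shards(num_workers: int = 4) -> List[Shard]:
    """Toy dataset for y = 3x, split round-robin across hypothetical workers."""
    data = [(float(x), 3.0 * x) for x in range(1, 2 * num_workers + 1)]
    return [data[i::num_workers] for i in range(num_workers)]

def local_gradient(w: float, shard: Shard) -> float:
    """Mean-squared-error gradient for y = w * x on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values: List[float]) -> float:
    """Stand-in for an MPI allreduce: every worker receives the same average."""
    return sum(values) / len(values)

def train(shards: List[Shard], steps: int = 200, lr: float = 0.01) -> float:
    w = 0.0
    for _ in range(steps):
        # Each worker computes its gradient independently (in parallel on a real cluster).
        grads = [local_gradient(w, shard) for shard in shards]
        # The communication step: this is where a fast interconnect matters.
        w -= lr * allreduce_mean(grads)
    return w

if __name__ == "__main__":
    print(train(make_shards()))
```

In this pattern the computation scales with the number of workers, while the averaging step depends on how quickly nodes can exchange data, which is why the network and the MPI library matter at the scale of a thousand GPUs.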
According to ZDNet, Thomas Schulthess, the director of the Swiss National Supercomputing Centre, said that this new scale will allow researchers to tackle a new class of deep learning problems that were previously thought to take months to complete.
Xuedong Huang, an engineer from Microsoft AI and Research, said this research can even push the boundaries of deep learning as it represents a powerful breakthrough for training and evaluating deep learning algorithms. The results of this breakthrough allow researchers to run larger, more complex deep learning workloads.