What a little more computing power can do
Neural networks have given researchers a powerful tool for looking into the future and making predictions. But one drawback is their insatiable need for data and computing power (“compute”) to process all that information. At MIT, demand for compute is estimated to be five times greater than what the Institute can offer. To help ease the crunch, industry has stepped in. An $11.6 million supercomputer recently donated by IBM comes online this fall, and in the past year, both IBM and Google have provided cloud credits to MIT Quest for Intelligence for distribution across campus. Four projects made possible by IBM and Google cloud donations are highlighted below.
Smaller, faster, smarter neural networks
To recognize a cat in a picture, a deep learning model may need to see millions of photos before its artificial neurons “learn” to identify a cat. The process is computationally intensive and carries a steep environmental cost, as new research attempting to measure artificial intelligence’s (AI’s) carbon footprint has highlighted.
But there may be a more efficient way. New MIT research shows that models only a fraction of the size are needed. “When you train a big network there’s a small one that could have done everything,” says Jonathan Frankle, a graduate student in MIT’s Department of Electrical Engineering and Computer Science (EECS).
With study co-author and EECS Professor Michael Carbin, Frankle estimates that a neural network could get by with one-tenth the number of connections if the right subnetwork is found at the outset. Normally, neural networks are trimmed only after training, when irrelevant connections are removed. Why not train the small model to begin with, Frankle wondered?
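For readers curious what trimming a network looks like in practice, here is a minimal sketch of conventional magnitude pruning, the after-the-fact approach described above, and not Frankle and Carbin's own method. It assumes that a connection's importance can be judged by the absolute size of its weight; the function names, the toy 10-by-20 weight matrix, and the one-tenth keep fraction are all illustrative choices rather than details from the research.

```python
def magnitude_prune(weights, keep_fraction=0.1):
    """Return a 0/1 mask that keeps only the largest-magnitude weights.

    Assumption: importance is measured by absolute weight value.
    """
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, int(len(magnitudes) * keep_fraction))
    threshold = magnitudes[n_keep - 1]  # smallest magnitude that survives
    return [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]


def apply_mask(weights, mask):
    """Zero out every connection whose mask entry is 0."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Hypothetical "trained" layer: 10 x 20 = 200 connections with distinct
# magnitudes (1/200 ... 200/200) and alternating signs.
weights = [
    [(-1) ** j * (((i * 20 + j) * 37 % 200) + 1) / 200 for j in range(20)]
    for i in range(10)
]

mask = magnitude_prune(weights, keep_fraction=0.1)
pruned = apply_mask(weights, mask)

kept = sum(sum(row) for row in mask)
total = sum(len(row) for row in weights)
print(f"kept {kept} of {total} connections")
```

In this toy setup the mask retains one connection in ten. The question Frankle raises is whether a mask like this could be identified before the expensive training run rather than after it.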