AlexNet was not the first fast GPU implementation of a CNN to win an image recognition contest. A CNN on GPU by K. Chellapilla et al. was 4 times faster than an equivalent implementation on CPU. A deep CNN of Dan Cireșan et al. at IDSIA was 60 times faster than an equivalent CPU implementation and achieved superhuman performance in August 2011. Between May 15, 2011 and September 10, 2012, their CNN won no fewer than four image competitions. They also significantly improved on the best performance in the literature for multiple image databases. According to the AlexNet paper, Cireșan's earlier net is "somewhat similar." Both were originally written in CUDA to run with GPU support. In fact, both are variants of the CNN designs introduced by Yann LeCun et al., who applied the backpropagation algorithm to a variant of Kunihiko Fukushima's original CNN architecture, the "neocognitron." The architecture was later modified by J. Weng's method called max-pooling. In 2015, AlexNet was outperformed by Microsoft Research Asia's very deep CNN with over 100 layers, which won the ImageNet 2015 contest.
Network design
AlexNet contained eight layers: the first five were convolutional layers, some of them followed by max-pooling layers, and the last three were fully connected layers. It used the non-saturating ReLU activation function, which showed improved training performance over tanh and sigmoid.
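The "non-saturating" property can be illustrated numerically. The sketch below (plain Python, not AlexNet's actual CUDA code) compares the gradients of ReLU and tanh for a large pre-activation value: ReLU passes a full-strength gradient for any positive input, while tanh's gradient has all but vanished, which is the saturation effect that slows training with tanh and sigmoid units.

```python
import math

def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    """Gradient of tanh: 1 - tanh(x)^2, which vanishes as |x| grows."""
    return 1.0 - math.tanh(x) ** 2

# For a large positive pre-activation, ReLU's gradient is still 1.0,
# while tanh's gradient is vanishingly small (below 1e-6 at x = 10).
print(relu_grad(10.0))  # 1.0
print(tanh_grad(10.0))  # prints a value below 1e-6
```

This is the intuition behind the paper's observation that deep networks with ReLUs train several times faster than equivalents with saturating units.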
Influence
AlexNet is considered one of the most influential papers published in computer vision, having spurred many more papers employing CNNs and GPUs to accelerate deep learning. The AlexNet paper has been cited over 61,000 times.
Alex Krizhevsky
Alex Krizhevsky is a computer scientist most noted for his work on artificial neural networks and deep learning. Shortly after winning the 2012 ImageNet challenge with AlexNet, he and his colleagues sold their startup, DNN Research Inc., to Google. Krizhevsky left Google in September 2017 after losing interest in the work. At the company Dessa, Krizhevsky advises and helps research new deep-learning techniques. Many of his numerous papers on machine learning and computer vision are frequently cited by other researchers.