Learn about neon™ with the Nervana Deep Learning Course

2016-07-16

Intel Nervana is excited to share a series of short Nervana videos and accompanying exercises to learn how to build deep learning models with neon, our deep learning framework. We start with a basic introduction to deep learning concepts, provide an overview of the neon framework, and discuss key neon concepts such as loading data and defining branching architectures. This will be a living series, so check back for more updates and videos!

You can also find more resources, including pre-trained models, Kaggle challenge scripts, videos from our meetups, and more here.

Video Sessions

01 Deep learning introduction

This video introduces the basic deep learning concepts necessary to both understand the neon codebase and build your own deep learning models. We discuss how deep learning differs from traditional machine learning, and cover basic concepts such as supervised learning, backpropagation, stochastic gradient descent, activation functions, and the basic linear unit.
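
To make stochastic gradient descent and backpropagation concrete, here is a minimal sketch that fits a single linear unit to noiseless data. It is plain Python and does not use neon; the function name `train_linear_unit`, the learning rate, and the toy data are all assumptions chosen for illustration.

```python
import random


def train_linear_unit(data, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)              # visit the samples in random order
        for x, y in data:
            pred = w * x + b           # forward pass
            err = pred - y             # dL/dpred for L = 0.5 * err**2
            w -= lr * err * x          # gradient step on the weight
            b -= lr * err              # gradient step on the bias
    return w, b


# Toy data drawn from y = 2x + 1 on the interval [-1, 1].
samples = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = train_linear_unit(samples)
```

With this data the learned `w` and `b` should settle close to 2 and 1. A real model replaces the single unit with many layers, but each update follows the same pattern: forward pass, error, gradient step.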

02 Recurrent neural networks

For sequence data such as speech or text, recurrent neural networks (RNNs) are often used to capture the short- and long-term temporal dependencies in the data. Training RNNs is challenging because of the vanishing gradient problem. We introduce the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, which are designed to combat the vanishing gradient problem.
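
The vanishing gradient problem can be seen in a toy setting. The sketch below assumes a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t), which is a simplification for illustration and not a model from the video. Backpropagating through each step multiplies the gradient by w * (1 - h_t^2), so with |w| < 1 the gradient shrinks geometrically with sequence length.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w*h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)      # chain rule: d h_t / d h_{t-1}
    return abs(grad)


short_grad = gradient_through_time(0.5, [0.1] * 5)    # 5 time steps
long_grad = gradient_through_time(0.5, [0.1] * 50)    # 50 time steps
```

Here `long_grad` is many orders of magnitude smaller than `short_grad`, which is why early time steps receive almost no learning signal in a plain RNN.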

03 Convolutional Neural Networks

For images and other data where ordering in the spatial dimensions has meaning, convolutional neural networks have proven effective. In this video, we discuss 1D, 2D, and 3D convolutional networks, and review recent CNN architectures that have enabled deeper and more powerful models (VGG, ResNet, etc.).
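
As a small illustration of the 1D case, the sketch below slides a kernel over a signal without padding. It is plain Python rather than neon code, and the function name `conv1d` and the example kernel are assumptions made for this example.

```python
def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


# A difference kernel responds where the signal steps up or down.
edges = conv1d([0, 0, 1, 1, 1, 0], [-1, 1])
strided = conv1d([0, 0, 1, 1, 1, 0], [-1, 1], stride=2)
```

The same kernel weights are reused at every position, which is what lets convolutional layers exploit spatial ordering with few parameters. The 2D and 3D cases apply the same idea along more axes.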

04 Neon Workflow

The neon deep learning framework provides an easy python-based approach to getting started with deep learning. Here we introduce the basic modules within neon and how to construct models and use our command line arguments to customize training runs. We recommend viewing this video before trying our jupyter notebooks. The MNIST Example and Fine-tuning VGG notebooks below are useful companions.

05 Neon Concepts

In this video, we discuss two key concepts within neon: loading data into neon, and defining complex branching architectures. neon provides four different ways to load your data for training, depending on your data size and complexity. Several notebooks guide you through writing a custom dataset object, custom activation functions and layers, custom callbacks, and defining a complex branching model.
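
As a rough picture of what a custom dataset object does, the sketch below yields (inputs, targets) minibatches from in-memory lists. The class name `MiniBatchDataset` and its attributes are hypothetical and are not neon's API; neon's own dataset classes should be used in practice.

```python
class MiniBatchDataset:
    """Hypothetical dataset object that yields (inputs, targets) minibatches."""

    def __init__(self, inputs, targets, batch_size):
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        self.inputs = inputs
        self.targets = targets
        self.batch_size = batch_size

    @property
    def nbatches(self):
        # Ceiling division so that a final partial batch is counted.
        return -(-len(self.inputs) // self.batch_size)

    def __iter__(self):
        for start in range(0, len(self.inputs), self.batch_size):
            end = start + self.batch_size
            yield self.inputs[start:end], self.targets[start:end]


dataset = MiniBatchDataset(list(range(10)), [v % 2 for v in range(10)], batch_size=4)
batches = list(dataset)
```

A training loop can then iterate over `dataset` once per epoch without knowing how the data is stored.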

06 Nervana Cloud

Some of our notebooks require GPUs because of memory and speed constraints. Our Nervana Cloud provides an easy interface to launch training jobs on our GPU servers. Trained models can also be deployed on a server to receive incoming inference requests via a REST API. This video demonstrates how to launch jobs, inspect progress, and deploy a trained job for inference.

One of our popular cloud features is interactive mode, where users can launch a jupyter notebook server running on our GPUs and access the notebook through their web browser to interactively step through code for debugging or exploration.

Exercises

The videos above are accompanied by several jupyter notebooks, available at https://github.com/NervanaSystems/neon_course, that provide guided exercises on key neon concepts and common operations.

The jupyter notebooks in this repository include:

  1. MNIST example

Comprehensive walk-through of how to use neon to build a simple model to recognize handwritten digits. Recommended as an introduction to the neon framework.

  2. Fine-tuning

A popular application of deep learning is to load a pre-trained model and fine-tune on a new dataset that may have a different number of categories. This example walks through how to load a VGG model that has been pre-trained on ImageNet, a large corpus of natural images belonging to 1000 categories, and re-train the final few layers on the CIFAR-10 dataset, which has only 10 categories.

  3. Writing a custom dataset object

neon provides many built-in methods for loading data from images, videos, audio, text, and more. For the rare cases where you may have to implement a custom dataset object, this notebook guides users through building one for a modified version of the Street View House Number (SVHN) dataset. Users will not only write a custom dataset, but also design a network that, given an image, draws a bounding box around the digit sequence.

  4. Writing a custom activation function and a custom layer

This notebook walks developers through how to implement custom activation functions and layers within neon. We implement the Affine layer, and demonstrate the speed-up difference between using a python-based computation and our own heavily optimized kernels.

  5. Defining complex branching models

When a simple sequential list of layers does not suffice for your model, this notebook shows how to build complex branching models within neon.

  6. Deep Residual network on the CIFAR-10 dataset

In neon, models are constructed as python lists, which makes it easy to use for-loops to define complex models that have repeated patterns, such as deep residual networks. This notebook is an end-to-end walkthrough of building a deep residual network, training on the CIFAR-10 dataset, and then applying the model to predict categories on novel images.

  7. Writing a custom callback

Callbacks allow a model to report its progress back to users during training. In this notebook, we present a callback that plots training cost in real-time within the jupyter notebook.

  8. Detecting overfitting

Overfitting is often encountered when training deep learning models. This tutorial demonstrates how to use our visualization tools to detect when a model has overfit on the training data, and how to apply Dropout layers to correct the problem.
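
Overfitting can also be flagged directly from logged costs. The sketch below is plain Python and is not one of neon's visualization tools; the function name `overfit_epoch` and the `patience` parameter are assumptions for illustration. It reports the epoch with the best validation cost once that cost has failed to improve for `patience` consecutive epochs.

```python
def overfit_epoch(val_costs, patience=2):
    """Return the epoch with the best validation cost once it has gone
    `patience` epochs without improving, or None if that never happens."""
    best_epoch, best_cost = 0, float("inf")
    for epoch, cost in enumerate(val_costs):
        if cost < best_cost:
            best_epoch, best_cost = epoch, cost
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None


# Validation cost falls, then rises while training would keep improving.
turning_point = overfit_epoch([0.9, 0.6, 0.5, 0.55, 0.6, 0.7])
still_improving = overfit_epoch([0.9, 0.8, 0.7])
```

In the first series the validation cost bottoms out at epoch 2, so that is the point after which further training mostly fits the training data. Regularization such as Dropout aims to push that turning point later or remove it.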

Related Blog Posts

neon™ 2.0: Optimized for Intel® Architectures

neon™ is a deep learning framework created by Nervana Systems with industry leading performance on GPUs thanks to its custom assembly kernels and optimized algorithms. After Nervana joined Intel, we have been working together to bring superior performance to CPU platforms as well. Today, after the result of a great collaboration between the teams, we…

#neon

Intel® Nervana™ Graph Beta

We are building the Intel Nervana Graph project to be the LLVM for deep learning, and today we are excited to announce a beta release of our work we previously announced in a technical preview. We see the Intel Nervana Graph project as the beginning of an ecosystem of optimization passes, hardware backends and frontend…

#Intel Nervana Graph #neon

Training Generative Adversarial Networks in Flexpoint

With the recent flood of breakthrough products using deep learning for image classification, speech recognition and text understanding, it’s easy to think deep learning is just about supervised learning. But supervised learning requires labels, which most of the world’s data does not have. Instead, unsupervised learning, extracting insights from…

#neon