Deep Learning Fundamentals

About this course

Machine learning uses computers to run predictive models that learn from existing data to forecast future behaviors, outcomes, and trends. Deep learning is a subfield of machine learning in which models inspired by how the brain works are expressed mathematically. The parameters that define these models, which can number from a few thousand to more than 100 million, are learned automatically from the data.
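To give a feel for where those parameter counts come from, here is a minimal sketch that counts the weights and biases in a fully connected network. The function name and the layer sizes (784 input pixels, 400 hidden units, 10 digit classes) are assumptions chosen for illustration, not figures taken from the course.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output pair
        total += n_out         # one bias per output unit
    return total

# Assumed sizes: 28x28 = 784 input pixels, 10 digit classes, 400 hidden units.
logistic_regression = dense_parameter_count([784, 10])
small_mlp = dense_parameter_count([784, 400, 10])
print(logistic_regression, small_mlp)
```

Under these assumed sizes, a single linear layer already has a few thousand parameters, and adding one hidden layer pushes the count into the hundreds of thousands.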

Deep learning is a key enabler of AI-powered technologies being developed across the globe. In this deep learning course, you will learn an intuitive approach to building complex models that help machines solve real-world problems with human-like intelligence. These intuitive approaches are translated into working code through practical problems and hands-on exercises. You will learn how to build and derive insights from these models using Python Jupyter notebooks running on your local Windows or Linux machine, or on a virtual machine running on Azure. Alternatively, you can use the Microsoft Azure Notebooks platform for free.

This course provides the level of detail needed to enable engineers, data scientists, and technology managers to develop an intuitive understanding of the key concepts behind this game-changing technology. At the same time, you will learn simple yet powerful "motifs" that can be used with Lego-like flexibility to build an end-to-end deep learning model. You will learn how to use the Microsoft Cognitive Toolkit (previously known as CNTK) to harness the intelligence within massive datasets through deep learning with uncompromised scaling, speed, and accuracy.

What you'll learn

The components of a deep neural network and how they work together
The basic types of deep neural networks (MLP, CNN, RNN, LSTM) and the type of data each is designed for
A working knowledge of vocabulary, concepts, and algorithms used in deep learning
How to build:

An end-to-end model for recognizing hand-written digit images, using multi-class logistic regression and an MLP (Multi-Layer Perceptron)
A CNN (Convolutional Neural Network) model for improved digit recognition
An RNN (Recurrent Neural Network) model to forecast time-series data
An LSTM (Long Short-Term Memory) model to process sequential text data
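
As a rough illustration of the first model in the list, the sketch below shows the prediction step of multi-class logistic regression in plain Python: one linear score per class, followed by a softmax. It is not the course's Cognitive Toolkit code. The function names, the toy weights, and the two-feature input are all hypothetical, and training (learning the weights from data) is left out.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, biases, features):
    """Return (most likely class index, class probabilities)."""
    # One linear score per class: dot(weight_row, features) + bias.
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Hypothetical toy model: 3 classes, 2 input features, weights set by hand.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIASES = [0.0, 0.0, 0.0]

label, probabilities = predict(WEIGHTS, BIASES, [2.0, 0.5])
print(label, [round(p, 3) for p in probabilities])
```

For digit recognition, the same structure would apply with one weight row per digit class and one feature per pixel. An MLP inserts one or more hidden layers between the input and this final softmax step.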

Prerequisites

Basic programming skills
Working knowledge of data science
Skills equivalent to the following course:

DAT208x: Introduction to Python for Data Science

Course Syllabus

Week 1: Introduction to deep learning and a quick recap of machine learning concepts
Week 2: Building a simple multi-class classification model using logistic regression
Week 3: Detecting digits in hand-written digit images, progressing from a simple end-to-end model to a deep neural network
Week 4: Improving hand-written digit recognition with a convolutional network
Week 5: Building a model to forecast time-series data using a recurrent network
Week 6: Building a text data application using recurrent LSTM (long short-term memory) units

Meet the instructors

Jonathan Sanito

Senior Content Developer
Microsoft

Jonathan works as a content developer and project manager for Microsoft, focusing on Data and Analytics online training. He has worked on training for developer and IT pro audiences, from Microsoft Dynamics NAV to Windows Active Directory.

Before coming to Microsoft, Jonathan worked as a consultant for a Microsoft partner, implementing Microsoft Dynamics NAV solutions.

Sayan Pathak

Principal ML Scientist and AI School Instructor, CNTK team
Microsoft

Sayan is a Principal Engineer and Machine Learning Scientist on the CNTK team at Microsoft. He has published and commercialized cutting-edge computer vision and machine learning technology for big data problems in the medical imaging, neuroscience, computational advertising, and social network domains. Prior to joining Microsoft, he worked at the Allen Institute for Brain Science. He has been a consultant to several startups and principal investigator on several US National Institutes of Health (NIH) grants. He has been on the faculty at the University of Washington for 15 years and has been an Affiliate Professor in CSE at the Indian Institute of Technology, Kharagpur, India, for over 4 years. He has taught several courses (namely Image Computing Systems, Information Retrieval, Social Networks, and Machine Learning) at the undergraduate and graduate levels. He has served on the committees of several doctoral and master's students.

Roland Fernandez

Senior Researcher and AI School Instructor, Deep Learning Technology Center
Microsoft Research AI

Roland works as a researcher and AI School instructor in the Deep Learning Technology Center (DLTC) of Microsoft Research AI. His interests include reinforcement learning, autonomous multitask learning, symbolic representation, AI education, information visualization, and HCI. Before coming to the DLTC, Roland worked in the VIBE group of MSR on visualization and HCI projects, most notably the SandDance project. Before MSR, Roland worked (at Microsoft and other companies) in the areas of Natural User Interfaces, Activity Based Computing, Advanced Prototyping, Programmer Tools, Operating Systems, and Databases.

Enrollment in this course is by invitation only