I am nearly finished with my coursework in machine learning and artificial intelligence. As I wrap up,
I thought I would recommend a curriculum for others who are just beginning a similar journey - incorporating the benefit
of hindsight. My guiding philosophy is to build strong fundamental understanding. This leads to intuition, and ultimately
the ability to creatively solve new problems in multiple areas.
I completed about 25 MOOCs.
The courses listed below are the best of the best. I left off survey courses that didn’t promote deep understanding.
Also, there are a lot of duplicate courses out there because this is a hot area right now. While it can be useful seeing the
same material presented from different perspectives, I chose the best presentation below.
I recommend taking the courses in the order I’ve listed them. In some cases, the order is to avoid struggles due to missing
prerequisites. In other cases, the ordering will give you a better perspective on what is to come. I also avoided front-loading
all the math courses - it is important to have variety.
Note: It takes a high level of motivation and discipline to learn this material. I worked many extra problems
with pencil and paper for a fuller understanding. So, dig in when you don’t understand something. Don’t just guess-and-check
the multiple-choice problems and move on. Your goal is not to earn a bunch of MOOC certificates; your goal is to
learn these subjects.
This is the famous Andrew Ng course. In fact, if you’re
reading this, it is quite likely you’ve already completed this one. This is an excellent introduction to Machine Learning,
but it is not a university level course like Stanford’s on-campus CS229. Nevertheless, I still think this course is
a great place to start.
The lectures are fantastic with plenty of practical advice and a
bit of theory. The programming assignments all use MATLAB or Octave. MATLAB tutorials are provided at the start, so no prior experience is needed.
In the assignments, you build the classic machine learning algorithms
almost from scratch using basic matrix operations. The course covers all the greatest hits: linear regression, logistic regression,
gradient descent, regularization, neural networks, support vector machines, bias vs. variance and so on.
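For a flavor of what those from-scratch assignments feel like, here is a minimal sketch of batch gradient descent fitting a linear regression. The course itself uses MATLAB/Octave; numpy is used here instead, and the function name and data are my own illustration:

```python
import numpy as np

def gradient_descent(X, y, alpha=1.0, iters=2000):
    """Minimize J(theta) = (1/2m) * ||X @ theta - y||^2 by batch gradient descent."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m  # gradient of J with respect to theta
        theta -= alpha * grad
    return theta

# Fit y = 1 + 2x from noiseless data; the bias term is a column of ones.
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
theta = gradient_descent(X, y)
print(theta)  # converges to approximately [1.0, 2.0]
```

Working a toy example like this by hand first (computing the gradient yourself) is exactly the kind of extra pencil-and-paper effort I recommended above.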
This course has no serious math or programming prerequisites. You will leave with incredible practical advice on applying
machine learning. And you will gain skills that you can apply right away in your current job or studies. This course lays
a great foundation, but you will need to take a more intense machine learning course later in the curriculum.
Python is the language
for AI and Machine Learning. This isn’t a Python course, but you will learn Python well enough to succeed in the upcoming
courses. The course develops a basic foundation in algorithmic thinking. You will finish with a solid understanding of object-oriented programming.
This is a high-quality MOOC. It uses the EdX platform beautifully. The lectures
are excellent, and the programming assignments drive understanding. After taking this course you will be ahead of many of
your classmates in the MOOCs to come.
If you are just a couple years out of college, you can probably skip these
two courses. If not, and your calculus is rusty, you will need to take these. Otherwise, you will struggle with parts of probability,
and other courses that follow. Several courses take for granted a level of mathematical maturity. So, getting comfortable
manipulating equations again will pay off. These two courses will sharpen the skills you will need.
Like most of the MIT EdX courses, these two are excellent. Each course is 1/3rd of MIT’s first-semester calculus.
These are original MIT lectures that have been sliced into the EdX format. Professor David Jerison is an engaging lecturer.
I enjoy learning from someone so at ease with the material, who is teaching only with their voice and a piece of chalk.
Until the 1980s, Artificial Intelligence was hampered by its reliance
on boolean rule-based systems. There have been many breakthroughs in AI, but the introduction of probabilistic reasoning may
be one of the most important. This article on Professor Judea Pearl’s work and his ACM Turing Award does a nice job explaining the importance of introducing uncertainty to AI.
You are going to see a lot of probability in the coming courses. It takes time, but you will need to become very comfortable with
random variables, variance, expectations, Bayesian inference, Markov chains and so on. This one is not easy, but it will pay
big dividends. John Tsitsiklis’ lectures are incredible and you should own a copy of the textbook.
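As a taste of the style of reasoning this course drills, here is a small Bayes’ rule calculation. The numbers are illustrative, not taken from the course:

```python
# A test with 99% sensitivity and a 5% false-positive rate, for a
# condition with 1% prevalence. How worried should a positive be?
p_d = 0.01          # prior: P(disease)
p_pos_d = 0.99      # sensitivity: P(positive | disease)
p_pos_nd = 0.05     # false-positive rate: P(positive | no disease)

p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)  # law of total probability
p_d_pos = p_pos_d * p_d / p_pos               # Bayes' rule
print(round(p_d_pos, 3))  # 0.167: a positive test is still mostly a false alarm
```

If the answer surprises you, that is precisely the intuition the course builds; problems like this should become second nature by the end.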
Not long after completing this course, I decided to change careers
and focus on machine learning and AI. Suffice to say, I thought it was incredible. I have linked to the version that was taught
by Professor Dan Klein and Professor Pieter Abbeel. Although this course was once offered on the EdX platform, it is now only
available as self-study. While the deadlines and structure of a MOOC are nice, if you made it this far, you have the discipline
required to complete this course.
I remember watching in awe the result of wrapping a reinforcement
learning engine around a simulated one-armed robot. This robot flopped its arm around and, every once in a while, it got lucky
and dragged itself forward. The robot earned a reward when it stumbled forward. It was like watching a child learn how to crawl.
Eventually, this block robot, with a single hinged arm, was dragging itself almost gracefully across my computer screen.
The true highlight of the course is the set of Pacman-themed programming assignments. As the course goes
on you endow your Pacman agent with increasing levels of intelligence. These labs are tough but rewarding. I usually lost
track of time while completing them, often staying up late into the night.
Linear algebra is an essential tool for machine learning. Again,
this is not a MOOC with deadlines, discussion forums and quizzes every 10 minutes. You need to watch the lectures, do the
same problem sets as the MIT undergraduates, and take the exams. But the resources are available to make this very doable.
Professor Strang’s lectures are great - I enjoyed them more and more as the course progressed. And Strang’s textbook is excellent, with fully worked solutions available for all the problems.
No longer will
you think of matrix multiplication as a series of repeated dot-products. You will start to think in the row- and column-space.
You will understand the beauty and importance of eigenvalues and eigenvectors. Projections and their relationship to linear
regression will make perfect sense. The course closes with a beautiful treatment of singular value decomposition (SVD).
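Here is a quick numpy sketch of that payoff: least squares as a projection of b onto the column space of A, with the same solution recovered from the SVD. The matrices are random stand-ins of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 3))  # tall matrix: 10 equations, 3 unknowns
b = rng.normal(size=10)

# Projection view: x_hat solves the normal equations A^T A x = A^T b,
# and p = A @ x_hat is the projection of b onto C(A).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat
print(np.allclose(A.T @ (b - p), 0))  # True: the residual is orthogonal to C(A)

# SVD view: the same x_hat via the pseudoinverse, x = V S^+ U^T b.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((U.T @ b) / s)
print(np.allclose(x_hat, x_svd))  # True
```

Once the course clicks, both halves of this snippet read as the same geometric statement, which is exactly the shift in thinking I described above.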
Algorithms (Stanford / Coursera)
To bring your ideas to life, you need a solid understanding
of Algorithms. I chose the Stanford course taught by Professor Tim Roughgarden and I was pleased with my choice. The professor
had a contagious enthusiasm for the subject. The course was rigorous, with most of the important algorithms proven correct
and analyzed for performance (running time and space) in lecture. The programming assignments were well thought out and the Coursera forums
were fairly active in the offering I took.
By the end, you will have a solid ability to implement
most of the algorithm “Greatest Hits” (a phrase Professor Roughgarden used frequently).
* I also completed the first half of MIT’s 6.006 algorithms course on OCW. I thought this was very good too, with terrific lectures and thoughtful assignments. So, 6.006 is a good alternative
if the Stanford course doesn’t suit you.
Back to machine learning. I mentioned you would eventually need to
take a more rigorous, theoretical course on machine learning. This course is the same as the on-campus Caltech course taught
by Professor Abu-Mostafa. From the first lecture, it is obvious how much care and thought the professor has put into choosing
exactly what to teach and how to teach it.
The course begins, appropriately, by answering
the question “Is Learning Feasible?” This question is fundamental to what we are trying to accomplish with machine
learning. It is important at some point to address this question head-on. Then, to quote from the course text, “From
over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the
subject should know.” The course doesn’t try to cover every learning algorithm or recipe. Rather, Professor
Abu-Mostafa has carefully chosen what to teach with a clear purpose. When the course ends, you are prepared to go off in many
directions with a solid foundation.
The course provides an in-depth treatment of linear and logistic
regression, support vector machines (SVM), neural networks, and clustering. Take the time to read the e-Chapters from the
book; they are all very important. Complete the exercises, and make sure you understand the EM algorithm derivation.
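If you want a concrete anchor for that EM derivation, here is a minimal sketch of my own (a toy, not from the book’s e-Chapters): EM for a two-component 1D Gaussian mixture with known unit variances, so only the means and the mixing weight are estimated:

```python
import numpy as np

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])

pi, mu = 0.5, np.array([-1.0, 1.0])  # deliberately poor initial guesses
for _ in range(50):
    # E-step: responsibilities r[i, k] = P(component k | x_i)
    dens = np.exp(-0.5 * (data[:, None] - mu) ** 2)  # unnormalized N(x; mu_k, 1)
    w = dens * np.array([pi, 1 - pi])
    r = w / w.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the responsibilities
    pi = r[:, 0].mean()
    mu = (r * data[:, None]).sum(axis=0) / r.sum(axis=0)

print(mu, pi)  # means near [-3, 3], mixing weight near 0.5
```

Deriving why these two steps monotonically increase the likelihood is the part worth doing on paper.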
* I will mention a 2nd option. Professor Ng’s full CS229 course is available through Stanford Engineering Everywhere. This includes all the assignments and actual on-campus lectures. I have no doubt this course is excellent.
This course by Professor David MacKay is a gem. I was saddened
to find out that Professor MacKay has passed away - much too early. The course consists of the recorded lectures and the accompanying text. It is an advanced course and builds nicely on everything you have learned so far. With this course, you begin to set yourself
apart from the crowd.
Professor Mackay takes a unique and inspired approach to teaching machine
learning. He begins with a review of probability, entropy, inference and information theory. Chapter 4 closes with a beautiful
treatment of Shannon’s Source Coding Theorem. Work every problem presented by the cartoon rat! This will cement several
topics you have learned so far.
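As a tiny warm-up for that material, here is the entropy function at the heart of the Source Coding Theorem: the lower bound, in bits per symbol, on how far a source can be compressed. The examples are my own:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) * log2 p(x), in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: a fair coin is incompressible
print(entropy([0.9, 0.1]))  # ~0.469 bits: a biased coin compresses well
```

MacKay’s exercises make you feel why this number is the right bound, not just compute it.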
After Chapter 4, you can proceed directly to Chapters 19-46.
These are short, beautifully written chapters. Each is only 12-14 pages including the exercises and solutions. Do all the
exercises that have solutions. Note: Not all of these chapters have corresponding lecture videos.
Here are a few things you will understand deeply after completing this course:
- Bayesian model comparison and the Occam factor
- Variational methods
- Monte Carlo methods: the Metropolis method, Gibbs sampling, Hamiltonian Monte Carlo, overrelaxation
- Hopfield Networks and Boltzmann Machines
- Addressing high-dimensionality
The return on
your effort in this course is very high.
This is an advanced course on Neural Networks taught
by Professor Geoff Hinton. There are only a few programming assignments, and by now these will be very easy for you. So, to
get the most out of this course you will need to invent a few side projects. Here are some ideas:
- Implement a Restricted Boltzmann machine
- Implement an RNN
- With both gated (e.g. LSTM or GRU) and non-gated cells
- Experiment with dropout
- Build a deep net and pre-train its layers
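To show how small these side projects can start, here is a sketch of inverted dropout, a common formulation of the technique the lectures describe (this exact code is my own, not from the course): each unit survives with probability `keep_prob` and is rescaled so the expected activation is unchanged, which means no rescaling is needed at test time:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob=0.8, train=True):
    """Inverted dropout: zero out units at train time and rescale survivors."""
    if not train:
        return activations  # test time: the network runs unchanged
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((4, 5))
print(dropout(a, keep_prob=0.8))  # entries are either 0.0 or 1.25
```

From here, a real side project would wire this into a small network’s forward pass and compare validation error with and without it.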
I found the book Deep Learning to be an excellent companion to this course. I recommend that you read this book cover-to-cover. Also, read the papers
that Professor Hinton has attached to the course material.
This is the only course I was uncertain about recommending.
It is an important topic, and this is one of only a few graduate-level MOOCs available. This particular course textbook is
difficult to read and the lectures can leave gaps in understanding. There are very few students participating in the forums
and no official course TAs or mentors.
All that said, I am very glad I completed it. It forced
me to better understand several topics: Markov Networks, variational methods, EM algorithm, and energy-based probability models.
You should be better prepared for this course than I was: you will have completed David MacKay’s course.
To make the course worthwhile, you must take it with the “Honors” option. The honors programming assignments
are critical to learning the material. They are challenging, with at least one taking me 15 hours to complete.
Your Foundation is Built. What Next?
At this point, you have built the foundation you need to head off in many different directions and excel. From here, you may want
to choose a specialization and take a couple of additional courses. Are you more interested in machine learning or AI? The
distinction between the two is sometimes fuzzy. Here are some suggestions:
Most importantly, put your knowledge into practice. This
is where real learning takes place: solving problems where a professor hasn’t carefully planned your path. Better yet,
find a job where you can work with experts in the field. While you are looking for a job, do some challenging projects to
highlight your abilities. Document your work and post your code.
This surely seems intimidating,
but make forward progress each day and you’ll be there before you know it. I hope you enjoy your journey as much as
I have enjoyed mine.