Last October, I decided to learn more about big data, machine learning and predictive analytics. I gave Coursera a try and enrolled in the 10-week Machine Learning class by Prof. Andrew Ng from Stanford University [1-4]. Prof. Ng is a world-renowned expert in machine learning, the director of the Stanford AI Lab, a truly amazing teacher and one of the co-founders of Coursera.
For those who do not know Coursera: it is an educational technology company that offers free massive open online courses. It partners with universities all around the globe and offers courses in computer science, engineering, physics, humanities, medicine, biology, social sciences, mathematics and business.
The machine learning class is completely free and held online. It consists of video lectures, multiple-choice assignments and programming exercises. The workload is estimated at around 5-7 hours a week, but you can easily spend more time once you dig deeper into the material. In this article I want to share my experiences with you, and I highly recommend that you enroll in the next session, which starts in about two weeks.
Syllabus: What will you learn?
The lectures are split thematically into videos of 8 to 15 minutes each, and the course covers the following topics:
Linear Regression with One Variable
Linear Regression with Multiple Variables
Neural Networks: Representation
Neural Networks: Learning
Advice for Applying Machine Learning
Machine Learning System Design
Support Vector Machines (SVMs)
Large-Scale Machine Learning
For every topic there is a multiple-choice assignment which needs to be handed in on a weekly basis. The assignments can be handed in as often as you want, as long as they are submitted before the deadline. The multiple-choice assignments are not very difficult: you should complete them to check your understanding of the material. The programming assignments are far more challenging.
For the programming assignments, you can choose to use either Matlab or Octave (a free and open-source Matlab alternative). If you want to use Matlab, you can even get a free student license from Stanford for the duration of the class. A programming assignment always consists of a set of data relevant to the exercise and some surrounding code which calls the functions you are about to implement. You are then only required to implement a specific part of an algorithm (a function). For example, the function projectData(X, U, K) below was part of the exercise on Principal Component Analysis and projects data into a lower-dimensional space:
function Z = projectData(X, U, K)
%PROJECTDATA Computes the reduced data representation when projecting only
%on to the top k eigenvectors
%   Z = projectData(X, U, K) computes the projection of
%   the normalized inputs X into the reduced dimensional space spanned by
%   the first K columns of U. It returns the projected examples in Z.
%

% You need to return the following variables correctly.
Z = zeros(size(X, 1), K);

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the projection of the data using only the top K
%               eigenvectors in U (first K columns).
%               For the i-th example X(i,:), the projection on to the k-th
%               eigenvector is given as follows:
%                    x = X(i, :)';
%                    projection_k = x' * U(:, k);
%
Ureduce = U(:,1:K);
for i=1:size(X,1),
    % you do stuff, which I removed not to spoil the solution...
end
% =============================================================

end
In almost all the assignments, you are not required to write the code that visualizes your data either. So that you can focus on the learning algorithms, all the code to load the data, visualize it and illustrate the machine learning concepts is already provided. The screenshot below shows a polynomial fit and a plot of the error on the training and cross-validation sets from one of the exercises.
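To give an idea of what sits behind a plot like that, here is a minimal sketch of how the training and cross-validation error can be computed for polynomial fits of increasing degree. It is not the course code: the data is synthetic, the alternating split and the halved mean squared error are my own assumptions, and the variable names are made up for illustration.

```octave
% Synthetic 1-D data (an assumption for illustration, not the course data set).
x = linspace(-1, 1, 40)';
y = sin(3 * x) + 0.1 * cos(25 * x);   % deterministic wiggle instead of random noise

% Alternate the examples between a training set and a cross-validation set.
train_idx = 1:2:numel(x);
cv_idx    = 2:2:numel(x);

max_degree  = 8;
error_train = zeros(max_degree, 1);
error_cv    = zeros(max_degree, 1);

for d = 1:max_degree
  % Least-squares polynomial fit of degree d, using the training examples only.
  p = polyfit(x(train_idx), y(train_idx), d);

  % Halved mean squared error on each set (assumed error measure).
  error_train(d) = mean((polyval(p, x(train_idx)) - y(train_idx)) .^ 2) / 2;
  error_cv(d)    = mean((polyval(p, x(cv_idx))    - y(cv_idx))    .^ 2) / 2;
end
```

Plotting error_train and error_cv against 1:max_degree gives two curves of the kind described above. The training error can only go down as the degree grows, while the cross-validation error is the one to watch when choosing a degree.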
Submission of the programming assignments to Coursera
When you have completed the assignment, you upload your code to the Coursera server (through a submit script), where it is checked by what appear to be unit tests. You get instant feedback on whether your solution is right or wrong, and sometimes you even get hints about what might be incorrect. When your solution is correct, you earn the points for that part of the assignment.
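I do not know how the Coursera grader works internally, but a check of this kind can be sketched in a few lines: run the submitted function on a small fixed input and compare the result with a known answer within a tolerance. Everything below is hypothetical, including the columnMeans stand-in function, the test input and the tolerance.

```octave
% Hypothetical stand-in for a student's function (not from the course).
columnMeans = @(X) sum(X, 1) / size(X, 1);

% Small fixed input with a hand-computed expected answer.
X        = [1 2; 3 4; 5 6];
expected = [3 4];

% The submission passes when every entry matches within a tolerance.
tolerance = 1e-8;
result    = columnMeans(X);
passed    = all(abs(result - expected) < tolerance);

if passed
  disp('Correct');
else
  disp('Incorrect: check the dimensions and the averaging step.');
end
```

Comparing within a tolerance rather than testing for exact equality matters for numerical code, where two correct implementations can differ in the last few digits.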
Statement of accomplishment
If you have successfully completed 80% of the multiple-choice and programming assignments, you are entitled to a statement of accomplishment. Note that you will not get any credits from Stanford University for this class, but you will get a signed statement of accomplishment from Professor Andrew Ng. If you are solely interested in learning the foundations of machine learning (as I was) and do not aim for official university credits, this free Coursera class (like many others) is a great route to lifelong learning.
I am happy that I successfully completed this class! As proof, I am proud to present my statement of accomplishment for the class:
Machine Learning Statement of Accomplishment
If I have got you interested, you should sign up for this class soon, since the next session starts on 3 March 2014.