In October last year, I decided to learn more about big data, machine learning and predictive analytics. I gave Coursera a try and enrolled in the 10-week Machine Learning class by Prof. Andrew Ng from Stanford University [1-4]. Prof. Ng is a world-renowned expert in the field of machine learning, the director of the Stanford AI Lab, a truly amazing teacher and one of the co-founders of Coursera.

For those who do not know Coursera: it is an educational technology company that offers free massive open online courses (MOOCs). It partners with universities all around the globe and offers courses in computer science, engineering, physics, humanities, medicine, biology, social sciences, mathematics and business.

As I said, the machine learning class is completely free and held online. It consists of video lectures, multiple-choice assignments and programming exercises. The workload is estimated at around 5-7 hours a week, but you can easily spend more time once you dig deeper into the material. In this article I want to share my experiences with you, and I highly recommend that you enroll in the next session, which starts in around two weeks.

**Syllabus: What will you learn?**

The lectures are split thematically into videos of 8-15 minutes each. The course covers the following topics:

Linear Regression with One Variable

Linear Regression with Multiple Variables

Logistic Regression

Neural Networks: Representation

Neural Networks: Learning

Advice for Applying Machine Learning

Machine Learning System Design

Support Vector Machines (SVMs)

Clustering

Dimensionality Reduction

Anomaly Detection

Recommender Systems

Large-Scale Machine Learning

For every topic there is a multiple-choice assignment that needs to be handed in on a weekly basis. You can hand in an assignment as often as you want, as long as you submit it before the deadline. The multiple-choice assignments are not very difficult: you should complete them to check your understanding of the material. The far more challenging part is the programming assignments.

**Programming assignments**

For the programming assignments, you can choose to use either Matlab [5] or Octave (a free and open-source Matlab alternative [6]). If you want to use Matlab, you can even get a free student license from Stanford for the duration of the class. A programming assignment always consists of a set of data relevant to the exercise and some surrounding code that calls the functions you are about to implement. You are then only required to implement a specific part of an algorithm (a function). For example, the function *projectData(X, U, K)* below was part of the exercise on Principal Component Analysis and projects the data into a lower-dimensional space:

    function Z = projectData(X, U, K)
    %PROJECTDATA Computes the reduced data representation when projecting only
    %on to the top k eigenvectors
    %   Z = projectData(X, U, K) computes the projection of
    %   the normalized inputs X into the reduced dimensional space spanned by
    %   the first K columns of U. It returns the projected examples in Z.
    %

    % You need to return the following variables correctly.
    Z = zeros(size(X, 1), K);

    % ====================== YOUR CODE HERE ======================
    % Instructions: Compute the projection of the data using only the top K
    %               eigenvectors in U (first K columns).
    %               For the i-th example X(i,:), the projection on to the k-th
    %               eigenvector is given as follows:
    %                    x = X(i, :)';
    %                    projection_k = x' * U(:, k);
    %

    Ureduce = U(:,1:K);
    for i=1:size(X,1),
        % you do stuff, which I removed not to spoil the solution...
    end

    % =============================================================

    end

In almost all the assignments, you are not required to write the code that visualizes your data either. To reinforce your learning, all the code that loads the data, visualizes it and illustrates the machine learning concepts is already provided from the start. The screenshot from one of the exercises shows a polynomial fit and a plot of the error on the training and cross-validation sets.
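To give a feel for what such a plot of training and cross-validation error measures, here is a minimal sketch in plain Python (not the course's Octave code). Everything in it is an assumption made for illustration: the data points are made up, the helper names `fit_polynomial`, `predict` and `half_mse` are hypothetical, and the fit uses the textbook normal equations rather than whatever the exercise itself uses.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns w with h(x) = w[0] + w[1]*x + ..."""
    n = degree + 1
    # Normal equations (A^T A) w = A^T y for the design matrix A[i][j] = xs[i] ** j.
    lhs = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lhs[r][col]))
        lhs[col], lhs[pivot] = lhs[pivot], lhs[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = lhs[row][col] / lhs[col][col]
            for k in range(col, n):
                lhs[row][k] -= factor * lhs[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(lhs[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (rhs[i] - tail) / lhs[i][i]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(w))


def half_mse(w, xs, ys):
    """Squared-error cost 1/(2m) * sum((h(x) - y)^2); the 1/2 factor is an assumed convention."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


# Made-up data: roughly y = x^2 with small hand-picked perturbations.
train_x = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
train_y = [1.05, 0.32, 0.07, 0.02, 0.40, 0.95]
cv_x = [-0.8, -0.4, 0.0, 0.4, 0.8]
cv_y = [0.62, 0.18, -0.02, 0.15, 0.66]

# Fit on the training set only, then measure the error on both sets.
errors = {}
for degree in (1, 2, 3):
    w = fit_polynomial(train_x, train_y, degree)
    errors[degree] = (half_mse(w, train_x, train_y), half_mse(w, cv_x, cv_y))

for degree, (train_err, cv_err) in errors.items():
    print(f"degree {degree}: train error {train_err:.4f}, cv error {cv_err:.4f}")
```

The point of the comparison is that the model is fitted on the training set only: a straight line underfits this curved data, so both errors are high, while the training error can only go down as the degree grows. The cross-validation error is what tells you whether the extra flexibility actually helps on unseen data.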

**Submission of the programming assignments to Coursera**

When you have completed the assignment, you upload your code to the Coursera server (through a submit script), where it is checked by some kind of unit test. You get instant feedback on whether your solution is right or wrong, and sometimes you even get hints about what might be incorrect. When your solution is correct, you earn the points for that part of the assignment.
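I do not know how Coursera's grader works internally, so the following is only a guess at the general idea, written in plain Python: run the submitted function on a test input, compare the result with a reference value within a tolerance, and hand out points plus feedback. The names `grade_part`, `mean_normalize` and `buggy_normalize`, the tolerance and the point values are all hypothetical.

```python
import math


def grade_part(student_fn, test_input, expected, points, tol=1e-6):
    """Run one test case and return (points_earned, feedback). Purely illustrative."""
    try:
        actual = student_fn(test_input)
    except Exception as exc:  # report the crash instead of aborting the grader
        return 0, f"Your code raised an error: {exc!r}"
    if len(actual) != len(expected):
        return 0, f"Hint: expected {len(expected)} values, got {len(actual)}."
    # Compare numerically with a tolerance rather than requiring exact equality.
    if all(math.isclose(a, e, abs_tol=tol) for a, e in zip(actual, expected)):
        return points, "Correct!"
    return 0, "Incorrect: the values differ from the reference result."


# Hypothetical exercise part: subtract the mean from every value.
def mean_normalize(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def buggy_normalize(values):
    return [v - values[0] for v in values]  # bug: subtracts the first value, not the mean


good = grade_part(mean_normalize, [1.0, 2.0, 3.0], [-1.0, 0.0, 1.0], points=10)
bad = grade_part(buggy_normalize, [1.0, 2.0, 3.0], [-1.0, 0.0, 1.0], points=10)
print(good)
print(bad)
```

Comparing with a tolerance matters for numerical code, because two correct implementations can differ in the last few decimal places.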

**Statement of accomplishment**


If you have successfully completed 80% of the multiple-choice and programming assignments, you are entitled to a statement of accomplishment. It is important to note that you will not get any credits from Stanford University for this class, but you will get a statement of accomplishment signed by Professor Andrew Ng. If you are solely interested in learning the foundations of machine learning (as I was) and do not aim for official university credits, this free Coursera class (like many others) is a great way to pursue lifelong learning.

I am happy that I successfully completed this class! As proof, I am proud to present my statement of accomplishment for the class:

Machine Learning Statement of Accomplishment

If I got you interested, you should sign up quickly for this class, since the next session starts on **3rd March 2014**.

**Links**

[1] http://www.coursera.org/

[2] http://www.coursera.org/course/ml

[3] http://cs.stanford.edu/people/ang/

[4] http://www.stanford.edu/

[5] http://www.mathworks.com/products/matlab/

[6] http://www.gnu.org/software/octave/

thx for summing up the course so well i am doing the course right now and hope to complete it by the end of the month. have skipped the tough neuronal network programming exercises so far (e.g. back prop) and will do them last minute – fingers crossed! 😉

Hi Sissi, thanks for your positive feedback. The neural network programming exercises are indeed the most challenging ones in the entire course.

Good luck!

Marc