 ## Learning Machine Learning | Week 1 | Machine Learning Course | Andrew Ng | Coursera

Hi, so if you have read this blog, then you probably know that I recently applied for and got financial aid from Coursera for Andrew Ng's Machine Learning course.

I have just completed its first week, and in this blog I will share what I have learnt throughout the period, so that if someday you decide to take the course, you don't go in blindfolded.

So, before I start going through it, I would like to say that this course is literally Machine Learning from scratch, because here you will learn the math behind the fancy predictions and thus know what actually happens under the hood. And Andrew makes it very easy to understand with his calm and soothing explanations.

So, don't blindly believe it if someone says it is highly mathematically complex.

The INTRO Section :

So, in the first section you will be welcomed warmly into the world of machine learning, and you will learn the introductory material related to ML, like:

What is MACHINE LEARNING?

Now that I have raised the question, let me give you two answers which I find very beautiful:

" It is the field of study that gives computer the ability to learn without being explicitly programmed ." - Arthur Samuel

Now time for a technical definition :

" A computer program is said to learn from experience E with respect to some class of tasks T and the performance measure P ,if it's performance at tasks in T, as measured by P, improves with experience E. " - Tom Mitchell

What is SUPERVISED and UNSUPERVISED Learning ?

Supervised Learning :
In Supervised Learning, the data is well labeled: we know that for this input, that is the output. In other words, we have well-defined input-output pairs from which the machine can understand the relations and thus be able to predict when provided a completely new input.

Problems based on Supervised Learning are classified into two types :

Regression :

In regression we predict results within a continuous output or function. For example, predicting the cost of buying a house based on its area. The output is continuous because it depends on some variable, like the area in this case, and the output, i.e. the price, varies continuously with it.
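To make this concrete, here is a tiny sketch in Python (not from the course) that predicts a house price from its area, assuming a completely made-up straight-line relationship:

```python
# A toy regression: predict house price (a continuous output) from area,
# assuming a hypothetical relationship price = 50 + 0.1 * area.
def predict_price(area_sqft):
    base, rate = 50.0, 0.1  # made-up parameters, for illustration only
    return base + rate * area_sqft

print(predict_price(1000))  # → 150.0
print(predict_price(1500))  # → 200.0
```

Notice how every different area gives a different price: the output varies continuously with the input.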

Classification :
Here we predict a discrete output, or literally classify things. For example, given a picture, identify whether it is a dog or a cat.
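A minimal sketch of the idea, using a single made-up feature and a hand-picked threshold (a real classifier would learn the boundary from labeled data):

```python
# A toy classifier: label an animal "cat" or "dog" from one made-up
# feature (say, ear length in cm). The 6.0 cm threshold is invented.
def classify(ear_length_cm):
    return "cat" if ear_length_cm < 6.0 else "dog"

print(classify(4.0))  # → cat
print(classify(9.0))  # → dog
```

The key contrast with regression: the output here is one of a fixed set of labels, not a continuously varying number.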

Simple, right!

[Figure: Regression (left) & Clustering (right)]

Unsupervised Learning :

Here we don't have well-defined input-output pairs, so we can't draw any conclusions directly from the data. Instead, as we train, we look for structures and patterns in the data and try to cluster them. This is called Clustering, which is a type of Unsupervised Learning.

For example, analyzing frequencies of radio signals from outer space to find patterns and cluster similar segments.
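To give a feel for clustering, here is a tiny 1-D k-means sketch in pure Python (k-means is a standard clustering algorithm; the "signal frequencies" below are numbers I made up):

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Tiny 1-D k-means: group unlabeled numbers into k clusters."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups of made-up "frequencies": one near 1, one near 10.
signals = [0.9, 1.0, 1.1, 9.9, 10.0, 10.1]
print([round(c, 3) for c in kmeans_1d(signals, k=2)])  # → [1.0, 10.0]
```

Note that the algorithm was never told which group any point belongs to; it discovers the two clusters from the data alone, which is exactly the unsupervised setting.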

Next you start learning about Models and the math formulae behind prediction .

Model Representation :

This is how a model is represented, where:

x= the input training data

y= the output of the respective x

h= this is called the Hypothesis Function

The Hypothesis Function, h, for linear regression is:

hθ(x) = θ0 + θ1·x

And, to measure the accuracy of this hypothesis, we need the Cost Function:

J(θ0, θ1) = (1/(2m)) · Σ from i=1 to m of (hθ(x^(i)) − y^(i))²

This cost function is also called the Squared Error or the Mean Squared Error function.

Because if you look at the function closely, the squared errors of all m training examples are summed up and divided by m, which is how we get the mean of the squares, and then we halve it by multiplying by 1/2, which makes the gradient descent math cleaner (you will learn about gradient descent in the next section).
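As a sanity check, the cost function above can be sketched in a few lines of Python (the training data here is a toy set I made up, lying exactly on the line y = 2x):

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def cost(theta0, theta1, xs, ys):
    """Mean Squared Error cost J = (1/2m) * sum of (h(x) - y)^2."""
    m = len(xs)
    squared_errors = [(hypothesis(theta0, theta1, x) - y) ** 2
                      for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)

# Toy data lying exactly on y = 2x, so theta0=0, theta1=2 fits perfectly.
xs = [1, 2, 3]
ys = [2, 4, 6]
print(cost(0, 2, xs, ys))  # → 0.0 (a perfect fit has zero cost)
print(cost(0, 1, xs, ys))  # a worse fit gives a positive cost
```

A perfect hypothesis makes every error zero, so the cost is zero; the worse the fit, the larger J gets, which is exactly what we minimize.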

Next, Andrew sir will give two lectures where he goes deep into the cost function, answering all the questions you might have about it, like why and how it works, and thus building an intuition in you.

A small review: we minimize the cost function using the Gradient Descent algorithm.

Okay, now we have to estimate the parameters in the hypothesis function. This is where the gradient descent algorithm comes in, and this is how it goes:

θj := θj − α · (∂/∂θj) J(θ0, θ1)

where,

θj = the jth parameter (here θ0 or θ1)

α = the Learning Rate

If you observe the part next to the 'α', it is just the partial derivative of the cost function with respect to θj, or you can also say it's the slope.

So, simply put, what this algorithm does is keep changing the values of the parameters in steps scaled by the learning rate, trying to reach the global minimum, and thus the learning rate plays an important role in this process. Now that we have a good idea of Gradient Descent, let's learn how to apply it to Linear Regression.
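The update rule above can be sketched in Python on a simple one-parameter example: minimizing J(θ) = θ², whose derivative is 2θ, so the minimum is at θ = 0 (the function and starting point are of my choosing):

```python
def gradient_descent(derivative, theta, alpha, steps):
    """Repeatedly step against the slope: theta := theta - alpha * dJ/dtheta."""
    for _ in range(steps):
        theta = theta - alpha * derivative(theta)
    return theta

# Minimize J(theta) = theta^2, whose derivative is 2*theta; minimum at 0.
theta = gradient_descent(lambda t: 2 * t, theta=5.0, alpha=0.1, steps=100)
print(round(theta, 6))  # → 0.0
```

Try a much larger α (say 1.5) and the updates overshoot and diverge instead of converging, which is why the learning rate matters so much.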

Applying Gradient Descent To Linear Regression :

This just means that we substitute the linear regression cost function into the gradient descent equation, and this is what we get as a result:

θ0 := θ0 − α · (1/m) · Σ from i=1 to m of (hθ(x^(i)) − y^(i))

θ1 := θ1 − α · (1/m) · Σ from i=1 to m of (hθ(x^(i)) − y^(i)) · x^(i)

The part next to the learning rate is nothing but what we get when we differentiate the cost function with respect to theta. Come on, let's do some math!

At the end of Week 1, you will take a quiz on all the things you have learnt till now, and yes, at the end you have an extra section where you can brush up your Linear Algebra skills.
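Putting the two update rules together, here is a small Python sketch of batch gradient descent for linear regression (the toy data lies exactly on a line of my choosing, y = 3 + 2x, so we know what the answer should be):

```python
def fit_linear_regression(xs, ys, alpha=0.1, steps=1000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Derivatives of the cost function with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Toy data on the line y = 3 + 2x; gradient descent should recover it.
xs = [0, 1, 2, 3]
ys = [3, 5, 7, 9]
theta0, theta1 = fit_linear_regression(xs, ys)
print(round(theta0, 3), round(theta1, 3))  # → 3.0 2.0
```

Starting from θ0 = θ1 = 0, the repeated updates walk the parameters down the cost surface until they land on the line that generated the data.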

That is all you learn in Week 1 of Andrew Ng's Machine Learning Stanford course.
I hope you got a good idea of the concepts and how the first week is going to be .

Thanks for making it this far. You are awesome!!