"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

October 01, 2016

Day #32 - Regularization in Machine Learning


Large coefficients often indicate overfitting. To avoid this we perform regularization, which adds a penalty on the size of the coefficients to the loss.
  • L1 - Sum of absolute values of the coefficients (Lasso - Least Absolute Shrinkage and Selection Operator). The L1 constraint region is diamond shaped, so the solution tends to land on the coordinate axes, driving some coefficients to exactly zero. This results in variable elimination - features that contribute minimally are dropped.
  • L2 - Sum of squares of the coefficients (Ridge). The L2 constraint region is circle shaped, so it shrinks all coefficients in the same proportion but eliminates none (see the sketch after this list).
  • Discriminative - In an SVM we use a hyperplane to separate the classes. This is an example of a discriminative approach.
  • Probabilistic - A generative view where the data is modeled by a Gaussian distribution. This ties back to the Central Limit Theorem: as the number of points grows, the data tends toward a Normal distribution, so we fit a Gaussian model.
  • Max Likelihood - Choosing the parameters that maximize the probability that a point p belongs to the assumed distribution.
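
The variable-elimination behaviour of L1 versus the proportional shrinkage of L2 is easy to see with scikit-learn's Lasso and Ridge estimators. The snippet below is a minimal sketch, assuming scikit-learn and numpy are available; the dataset, feature count and alpha values are arbitrary choices for illustration.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy regression data: only the first 3 of 10 features actually matter
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
true_coef = np.array([5.0, -3.0, 2.0] + [0.0] * 7)
y = X @ true_coef + rng.randn(200) * 0.5

# L1 (Lasso): drives the weak coefficients exactly to zero
lasso = Lasso(alpha=0.5).fit(X, y)
print("Lasso coefficients:", np.round(lasso.coef_, 2))

# L2 (Ridge): shrinks every coefficient a little but eliminates none
ridge = Ridge(alpha=0.5).fit(X, y)
print("Ridge coefficients:", np.round(ridge.coef_, 2))

With the illustrative alpha above, Lasso typically reports exact zeros for the seven irrelevant features while Ridge keeps small non-zero values for all ten.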
Good Read for L2 - Indeed, using the L2 loss comes from the assumption that the data is drawn from a Gaussian distribution; minimizing squared error is then maximum likelihood estimation under Gaussian noise.
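
To make the Gaussian connection concrete, here is a small numpy sketch (with made-up numbers and an assumed noise standard deviation) showing that the Gaussian negative log-likelihood of the residuals differs from the sum of squared errors only by a scale factor and a constant, so minimizing one minimizes the other.

import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.3, 3.7])
sigma = 1.0  # assumed noise standard deviation

residuals = y_true - y_pred
sse = np.sum(residuals ** 2)  # L2 loss (sum of squared errors)

# Negative log-likelihood under y ~ Normal(y_pred, sigma^2)
nll = 0.5 * sse / sigma**2 + len(y_true) * 0.5 * np.log(2 * np.pi * sigma**2)

print("SSE:", sse)
print("Gaussian NLL:", nll)  # equals SSE/(2*sigma^2) plus a constant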

Another Read -

  • The L1 loss function minimizes the absolute differences between the estimated values and the target values. It is more robust and generally less affected by outliers.
  • The L2 loss function minimizes the squared differences between the estimated and target values. The L2 error becomes much larger in the presence of outliers because the errors are squared (see the sketch below).
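
A quick way to see the outlier sensitivity is to compare mean absolute error (L1) and mean squared error (L2) on the same predictions before and after corrupting one target value. A minimal sketch with made-up numbers:

import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.2, 6.9, 9.1])

def l1_loss(a, b):
    return np.mean(np.abs(a - b))   # mean absolute error

def l2_loss(a, b):
    return np.mean((a - b) ** 2)    # mean squared error

print("No outlier   -> L1:", l1_loss(y_true, y_pred), "L2:", l2_loss(y_true, y_pred))

# Corrupt one target to act as an outlier
y_out = y_true.copy()
y_out[-1] = 30.0
print("With outlier -> L1:", l1_loss(y_out, y_pred), "L2:", l2_loss(y_out, y_pred))
# L1 grows modestly; L2 blows up because the outlier error is squared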

Happy Learning!!!
