"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

March 31, 2017

Day #60 - TSQL Profiling - ExpressProfiler

ExpressProfiler is far simpler and less cluttered than SQL Server Profiler
  • Profiler by DB Name
  • Profile by login account name
These two filters are good enough to nail down most issues. For blocking / deadlock analysis we can hop back to SQL Server Profiler. For basic checks, this tool meets the need.



Link - Download

JasperReports - passing parameters between datasets

Happy Learning!!!

Day #59 - Image Object Classification using Keras

This post covers basic image classification in Keras using VGG19. We leverage pre-trained models to detect objects in an image.
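
A minimal sketch of this with the Keras applications API; the image file name 'sample.jpg' is a placeholder:

# Minimal VGG19 classification sketch using Keras' pre-trained ImageNet weights.
import numpy as np
from keras.applications.vgg19 import VGG19, preprocess_input, decode_predictions
from keras.preprocessing import image

model = VGG19(weights='imagenet')           # downloads ImageNet weights on first use

img = image.load_img('sample.jpg', target_size=(224, 224))  # VGG19 input size
x = np.expand_dims(image.img_to_array(img), axis=0)          # batch of one
x = preprocess_input(x)                      # same preprocessing used during training

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])   # top-3 (class_id, label, score)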


Happy Learning!!!

March 20, 2017

Day #58 - Hacker Earth Challenge

With ongoing projects, it's a bit challenging to manage multiple tasks. Bookmarking my thoughts here until further analysis.

Problem - Link

Data Analysis (Approach)
  • Load data into SQL tables
  • Analyze each column: continuous or discrete variables
  • Outliers, missing data, summary of each data column
  • Manage class imbalances
  • Convert the dataset into numeric columns
  • Ignore any non-critical columns
  • Identify data correlations if they exist (pending task)
The Approach
1. Used the SMOTE technique to eliminate class imbalance (see the sketch below)
2. Used XGBoost to train and predict
3. Python 2.7; two files, one for data cleanup and a second for prediction
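
A minimal sketch of this pipeline; 'train.csv' and the 'target' column are placeholder names, and it assumes the cleanup step already produced numeric columns:

# SMOTE oversampling + XGBoost classification sketch.
import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from xgboost import XGBClassifier

df = pd.read_csv('train.csv')                 # placeholder file name
X, y = df.drop('target', axis=1), df['target']
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Oversample only the training split so synthetic points never leak into the test set.
# 2017-era imbalanced-learn uses fit_sample; newer versions call it fit_resample.
X_res, y_res = SMOTE(random_state=42).fit_sample(X_tr, y_tr)

clf = XGBClassifier(n_estimators=200, max_depth=5, learning_rate=0.1)
clf.fit(X_res, y_res)
print(classification_report(y_te, clf.predict(X_te)))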

Happy Learning!!!

Day #57 - Xgboost on Windows 7, Python 2.7

Installed xgboost on Windows 7 with Python 2.7 using the steps below. This link was useful.

Step 1 - Search for available packages
anaconda search -t conda xgboost

Step 2 - Install the Windows-compatible package
conda install -c mndrake xgboost

On Python 3 (64-bit Windows), alternative channel packages:

conda install -c jjhelmus r-xgboost-cpu
conda install -c mikesilva xgboost
conda install -c rdonnelly py-xgboost
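
Whichever channel works, a quick sanity check that the package imports:

# Verify the conda-installed xgboost is importable.
import xgboost
print(xgboost.__version__)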

Happy Learning!!!

March 15, 2017

error: Unable to find vcvarsall.bat

While building 'im2col_cython', I encountered this error on Windows 7. Several solutions were suggested, but only the option below worked for me.

building 'im2col_cython' extension
error: Unable to find vcvarsall.bat

This post was useful to fix it

Step 1 - Installed Microsoft Visual C++ Compiler for Python 2.7 from https://www.microsoft.com/en-in/download/details.aspx?id=44266

Step 2 - In the directory C:\Anaconda2\Lib\distutils, modified the msvc9compiler.py file as below

Step 3 - Compiled as below
python.exe setup.py build_ext --inplace --compiler=msvc

Happy Learning!!!

March 14, 2017

TSQL Date Dimension Data Population Script

This post shares a DateDimension data population script. Back to TSQL days...
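
The linked script is T-SQL; purely as an illustration of the same idea, here is a pandas sketch that generates the usual date-dimension columns:

# Sketch of a date-dimension table in pandas (the linked post does this in T-SQL).
import pandas as pd

dates = pd.date_range('2017-01-01', '2017-12-31', freq='D')
dim = pd.DataFrame({
    'DateKey':   dates.strftime('%Y%m%d').astype(int),  # surrogate key, e.g. 20170314
    'Date':      dates,
    'Year':      dates.year,
    'Quarter':   dates.quarter,
    'Month':     dates.month,
    'MonthName': dates.strftime('%B'),
    'DayOfWeek': dates.dayofweek + 1,                    # 1 = Monday
    'IsWeekend': dates.dayofweek >= 5,
})
print(dim.head())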

Happy Learning!!!

March 10, 2017

Startup Ideas

Idea #1 - Connecting the dots in the travel experience - Today we have cab aggregators, bus aggregator platforms, payment platforms, etc., but they still work in silos. There is a bigger need to integrate them and provide a more seamless travel experience. For example, tie-ups like (Uber + IndiGo + Paytm) would help integrate booking, pickup, travel, and drop to the final destination. This end-to-end solution would mean reduced cab wait times, better-handled delays, and a better travel experience for the customer.

Idea #2 - Analytics in wind / solar energy for better energy harvest - Adjust turbine rotation according to wind/weather forecasts, and change panel orientation based on weather and wind forecasts.

Happy Ideas!!!

IoT Training

It would be useful to do a few student projects to understand the fundamentals and get hands-on training. This is my next learning plan from June. Bookmarking a few interesting labs.
Happy Learning!!!

March 03, 2017

Day #56 - Deep Learning Class #3 Notes - Training Deep Networks

Part I
Parameters Overview
  • Multi-Layer Perceptron (mean square error loss, weight decay / regularization term to prevent overfitting)
  • The error function is a non-convex loss function - minimised with Gradient Descent
  • Saddle points (minimum along some dimensions, maximum along others) can be a problem in deep networks; train deep networks to avoid saddle points
  • Vanishing gradient problem - weight values become small and updates become slow; because of the chain rule in backpropagation, the gradient update that reaches earlier layers is very low
  • AlexNet had only eight layers; the best networks of the time used only 7-8 layers
  • Exploding gradient - clip the value when it is greater than a threshold
  • Mini-batch GD - run GD on batches (20 / 100 points); average the gradients over the batch and update all layers in the network. This is SGD with a batch size
  • Iteration - whenever a weight update is done
  • Epoch - whenever the training set is used once
  • Momentum - during GD; the idea is like a blind person navigating a mountain range
  • Momentum is useful for finding a local or better minimum on highly elliptical contours; not useful for a spherical contour plot
  • Spherical - the normal takes you directly to the centre
  • Contour plot - cross-section of the mountain
  • Nesterov momentum - look one step further in the direction of the step we are about to take (interim update, then compute weights); works very well in practice (see the sketch after this list)
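
A small numpy sketch of the Nesterov update on a toy quadratic loss; grad, lr, and mu here are illustrative values:

# Nesterov momentum update rule (numpy sketch).
import numpy as np

def grad(w):                      # toy quadratic loss: L(w) = 0.5 * ||w||^2
    return w

w = np.array([5.0, -3.0])
v = np.zeros_like(w)
lr, mu = 0.1, 0.9                 # learning rate, momentum coefficient

for _ in range(100):
    # Classic momentum would use grad(w); Nesterov evaluates the gradient
    # at the look-ahead point w + mu * v, then takes the velocity step.
    v = mu * v - lr * grad(w + mu * v)
    w = w + v

print(w)                          # converges toward the minimum at 0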
Part II
Choosing Activation Function
  • Different activation functions: sigmoid, tanh, ReLU, Leaky ReLU, maxout
  • Sigmoid (between 0 and 1) - squashes the input into this range and brings non-linearity into the network; its non-zero-centred output is addressed by tanh
  • tanh - belongs to the logistic family of functions (-1 to +1); used even today
  • ReLU - most popular - Rectified Linear Unit: max(0, x), linear on the positive side, zero on the negative side; the default for images and videos
  • Leaky ReLU - y = x if x > 0, else a small slope (e.g. 0.01x); lets a small amount of the negative signal pass through
  • maxout - given a layer of neurons in groups (e.g. of 10), the output is the max of each group
  • Softmax - ensures each activation lies between 0 and 1 (and the outputs sum to 1)
  • Hierarchical softmax - used in word2vec
  • ReLU mostly for images and videos; if too many dead units, try Leaky ReLU (see the sketch after this list)
  • In RNNs / LSTMs, sigmoid and tanh are still used
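
A quick numpy sketch of the activations listed above:

# The activation functions from the notes, as plain numpy.
import numpy as np

def sigmoid(x):  return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1)
def tanh(x):     return np.tanh(x)                   # squashes to (-1, +1)
def relu(x):     return np.maximum(0, x)             # max(0, x)
def leaky_relu(x, alpha=0.01):                       # small slope for x < 0
    return np.where(x > 0, x, alpha * x)
def softmax(x):                                      # outputs in (0, 1), sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x), leaky_relu(x), softmax(x), sep='\n')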
Part III
Choosing Loss Function
  • Loss function and cost function mean the same thing
  • MSE - the gradient is simple
  • Cross-entropy loss function
  • Entropy - H(p) = -Σ p_i log p_i
  • Binary cross-entropy
  • Negative log-likelihood (NLL)
  • Softmax for the binary case is the same as sigmoid
  • Start with NLL; minimise NLL given a particular activation function (see the sketch after this list)
  • KL divergence measures the distance between two distributions
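
A small numpy sketch of softmax cross-entropy / NLL and the entropy formula above:

# Cross-entropy / negative log-likelihood for a softmax classifier.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw scores for 3 classes
target = 0                            # index of the true class

p = softmax(logits)
nll = -np.log(p[target])              # cross-entropy with a one-hot target
print(p, nll)

# Entropy of a distribution: H(p) = -sum(p_i * log(p_i))
print(-(p * np.log(p)).sum())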
Part IV
Choosing Learning Rate
  • For a convex function, GD will always take you to the minimum
  • With the correct learning rate, GD can reach the minimum in one step
  • Hessian - matrix of second derivatives (the optimal learning rate is the inverse of the Hessian)
  • The gradient is a vector, not a single value
  • Optimal learning rate - relates to the eigenvalues of the Hessian
  • Adaptive methods are the best approach to choosing the learning rate
  • Adagrad is one such method
  • Slows down on steep cliffs; on flat surfaces progress is long and slow
  • RMSProp - Root Mean Square Propagation
  • Adam - most popular method today (a good default)
  • Adam = momentum + current gradient (see the sketch after this list)
  • AdaDelta is another similar method
  • To choose, see http://sebastianruder.com/optimizing-gradient-descent
  • SGD - mini-batch SGD
  • Choices for training - SGD + Nesterov momentum, or SGD with Adagrad / RMSProp / Adam
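
A numpy sketch of the Adam update rule; the hyperparameter values are the commonly used defaults:

# Adam update rule: momentum-style first moment plus RMSProp-style
# second moment, both bias-corrected.
import numpy as np

def grad(w):                              # toy quadratic loss
    return w

w = np.array([5.0, -3.0])
m = np.zeros_like(w); v = np.zeros_like(w)
lr, b1, b2, eps = 0.01, 0.9, 0.999, 1e-8

for t in range(1, 2001):
    g = grad(w)
    m = b1 * m + (1 - b1) * g             # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g * g         # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)             # bias correction for the zero init
    v_hat = v / (1 - b2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w)                                   # approaches the minimum at 0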
Part V - Math of Backpropagation
  • Backpropagation uses GD
  • Issues with GD / training with GD
  • Using the learning rate / optimization (a worked sketch follows this list)
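
A worked numpy sketch of backpropagation on a tiny two-layer network; the data and layer sizes are arbitrary:

# Backpropagation on a tiny 2-layer network: forward pass,
# chain-rule gradients, then a GD step.
import numpy as np

np.random.seed(0)
X = np.random.randn(4, 3)                 # 4 samples, 3 features (toy data)
y = np.random.randn(4, 1)
W1 = np.random.randn(3, 5) * 0.1          # layer sizes are arbitrary
W2 = np.random.randn(5, 1) * 0.1
lr = 0.1

for _ in range(200):
    h = np.maximum(0, X @ W1)             # hidden layer with ReLU
    y_hat = h @ W2                        # linear output
    loss = np.mean((y_hat - y) ** 2)      # MSE loss

    d_yhat = 2 * (y_hat - y) / len(y)     # dL/dy_hat
    dW2 = h.T @ d_yhat                    # chain rule, output layer
    dh = d_yhat @ W2.T
    dh[h <= 0] = 0                        # ReLU gradient: zero where inactive
    dW1 = X.T @ dh                        # chain rule, hidden layer

    W1 -= lr * dW1                        # gradient descent step
    W2 -= lr * dW2

print(loss)                               # final training loss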
Part VI - Regularization
  • Training a deep network is GD
  • The difference between ML and pure optimization is generalization
  • ML is about the best performance tomorrow (working well on unseen data; generalization is important)
  • Regularization methods are incorporated for generalization performance
  • When training accuracy increases but test accuracy decreases, that is the point to stop
Part VII - When to Stop
  • Train an epoch, lower the learning rate, and train again
  • Stop when the max weight change is less than a particular threshold
  • Weight decay term in the error function itself
  • L2 weight decay (add the square of the weights)
  • L1 weight decay (absolute value of the weights) - gives sparse solutions
  • Dropout - in each iteration / mini-batch, in every layer, randomly drop a certain % of nodes; this gives excellent regularization performance (see the sketch after this list)
  • Ensemble of different models (similar to Random Forests)
  • DropConnect (an extension of Dropout)
  • Add noise (data noise) - Gaussian, salt-and-pepper noise
  • Batch Normalization layer (recommended) - implemented in all the libraries
  • Shuffle your inputs
  • Choose the mini-batch size such that the network learns faster
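
A numpy sketch of (inverted) dropout on a layer's activations:

# Inverted dropout: randomly zero a fraction of units during training and
# rescale so the expected activation is unchanged; do nothing at test time.
import numpy as np

def dropout(h, p_drop=0.5, train=True):
    if not train:
        return h                                    # no dropout at test time
    mask = (np.random.rand(*h.shape) >= p_drop) / (1.0 - p_drop)
    return h * mask                                 # drop and rescale

h = np.ones((2, 8))
print(dropout(h, p_drop=0.5))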
Curriculum Learning
  • Present easier examples first, then progressively harder ones
  • Lots of data + lots of compute drive deep learning success (Google / FB)
  • Unsupervised learning is the approach taken by Facebook for data analysis
  • Generating data programmatically - NIPS machine learning conference
  • Data augmentation (change illumination in the data, reduce pixel intensity, train the network with all kinds of data - mirrored, noisy, artificial images)
Target Values
  • Binary classification problem - use +1 and -1
Weight Initialization
  • GD works but can take you to different local minima
  • The starting point is defined by how you initialize the network
  • Never initialize to zero
  • Recommended approach - Xavier initialization
  • For every layer in the network, get weights randomly from a uniform distribution (see the sketch below)
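
A numpy sketch of Xavier (Glorot) uniform initialization:

# Xavier initialization: uniform weights whose scale depends on fan-in and fan-out.
import numpy as np

def xavier_uniform(n_in, n_out):
    limit = np.sqrt(6.0 / (n_in + n_out))
    return np.random.uniform(-limit, limit, size=(n_in, n_out))

W = xavier_uniform(784, 256)
print(W.std())            # std close to sqrt(2 / (n_in + n_out))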

Happy Learning!!!