Anirudh Kala

24 Feb , 2015  

Anirudh Kala is the co-founder of Celebal Corp and has expertise in a range of Machine Learning techniques and Big Data management technologies. He has experience across multiple domains, including Life Sciences, Finance and Retail.



  • Machine Learning
  • Artificial Intelligence
  • R, Python
  • Spark, Cassandra, Kafka, Elasticsearch, Tachyon, MemSQL, Hive, Pig, Mahout, Flume
  • Open NLP, Stanbol, UIMA, Solr, Lucene, Nutch


Data Science, Deep Learning, Neural Network

How Celebal Leverages Deep Learning to Solve Business Problems

17 Oct , 2016  

Deep Learning is yet another buzzword, billed as the successor of Machine Learning and now a predecessor to Artificial Intelligence, or simply put, AI. The intriguing thing is the number of articles in our enterprise ecosystem, all emphatically pressing for the implementation of deep learning to solve real-world problems. However, most of them restrict themselves to uncomfortable algebra and calculus, which also seem to be “inspired” by Stanford courses (in case you are not aware of those, try googling CS231n and CS224d; they are awesome).

The challenging thing, however, is the capability to solve real “Business” problems. I put business in quotes because there is a plethora of tutorials using Deep Learning to solve problems like Computer Vision for self-driving cars or someone More…


Data Science, Random Forest, Uncategorized

Tuning Random Forest

8 Feb , 2016  

Hi there, this blog post will show you how to tune a Random Forest in R using different techniques, so let's get started:

We will use the Sonar dataset. This is the data set used by Gorman and Sejnowski in their study of the classification of sonar signals using a neural network. The task is to train a network to discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock.

Each pattern is a set of 60 numbers in the range 0.0 to 1.0.


Data Science, MOOC

How to become a data scientist

25 Jun , 2015   Video

This is one of the webinars I gave to educate students and industry professionals who were looking to shift their careers into Data Science. Data Science is a combination of Statistics, Math and Programming. For more details, please check out the video. The journey to becoming a data scientist has many milestones, and it is important to reach them one by one. Here are a few first steps for getting started:

  • Andrew Ng's machine learning course from Stanford University.
  • MIT OpenCourseWare for Statistics.
  • Python / R from Udacity.

There are also multiple ways and sources to improve your newly acquired skills:

  • TopCoder: a crowdsourcing platform that runs real problems from organisations whose solutions lie in and around Data Science.
  • Kaggle: another crowdsourcing platform, with more generic problems.
  • UCI Machine Learning Repository: a repository of multiple data sets.

Time Series

ARIMA Time Series Modeling

26 Feb , 2015  




Forecasting based on ARIMA (autoregressive integrated moving average) models, commonly known as the Box–Jenkins approach, comprises the following stages:

i.) Model identification

ii.) Parameter estimation

iii.) Diagnostic checking

These stages are repeated until a “suitable” model for the given data has been identified (e.g. for prediction). The following three sections show some facilities that R offers for assisting the three stages in the Box–Jenkins approach.
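As a rough illustration of the first stage (model identification), the sketch below differences a series and looks at its sample autocorrelation. The original post works in R; this is a plain Python stand-in, and the function names `difference` and `autocorrelation` as well as the toy data are assumptions made for the example, not part of the post.

```python
# Minimal sketch of the identification stage: difference a series and
# inspect its sample autocorrelation. Pure Python; all names and the toy
# data are illustrative assumptions.

def difference(series, lag=1):
    """Return the lag-differenced series (the "integrated" part of ARIMA)."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / variance
        for k in range(max_lag + 1)
    ]


# Hypothetical toy data: a linear trend plus a small alternating wiggle.
toy_series = [2.0 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(20)]
differenced = difference(toy_series)

acf_raw = autocorrelation(toy_series, max_lag=3)    # dominated by the trend
acf_diff = autocorrelation(differenced, max_lag=3)  # trend removed
```

On this toy series the lag-1 autocorrelation is strongly positive before differencing and negative afterwards, which is the kind of evidence one would weigh when choosing the order of differencing. Parameter estimation and diagnostic checking are not covered here.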

A first step in analyzing time series is More…


Neural Network

Convolutional neural networks

19 Feb , 2015  

Neural networks have been around for a number of decades now and have seen their ups and downs. Recently they’ve proved to be extremely powerful for image recognition problems. Or, rather, a particular type of neural network called a convolutional neural network has proved very effective. In this post, I want to build off of the series of posts I wrote about neural networks a few months ago, plus some ideas from my post on digital images, to explain the difference between a convolutional neural network and a classical (is that the right term?) neural network.

First, let me quickly review the idea behind a neural network: We start with a collection of neurons, each of which takes a collection of input values and uses them to calculate a single output value. Then we hook them all together, so that the inputs to each neuron are attached to either the outputs of other neurons or to coordinates/features of a data point that is fed into the network.
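The description above can be sketched in a few lines of Python. This is only a minimal illustration: the post does not say how a neuron combines its inputs, so the weighted sum with a sigmoid activation, the helper names `neuron_output` and `layer_output`, and all weights are assumptions chosen for the example.

```python
import math

# Minimal sketch of "neurons hooked together". The weighted-sum-plus-sigmoid
# neuron and every weight below are illustrative assumptions.


def neuron_output(inputs, weights, bias):
    """Combine a collection of input values into a single output value."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer_output(inputs, layer):
    """Evaluate every neuron in a layer on the same inputs."""
    return [neuron_output(inputs, weights, bias) for weights, bias in layer]


# Each neuron is a (weights, bias) pair; values are hypothetical.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_layer = [([1.0, -1.0], 0.0)]

data_point = [1.0, 1.0]                           # features fed into the network
hidden = layer_output(data_point, hidden_layer)   # first level of neurons
prediction = layer_output(hidden, output_layer)   # fed by the hidden outputs
```

The first layer reads the coordinates of the data point directly, while the output neuron reads only the outputs of other neurons, which matches the two kinds of connections described in the paragraph.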

When you input a data point into a neural network, the outputs of the first level of neurons are calculated, then they feed More…


Dimensionality Reduction

Principal component analysis (PCA)

19 Feb , 2015  

Why PCA?

Principal component analysis is a way to reduce the dimensionality of a data set consisting of numeric vectors. It then becomes possible to visualize the data set in three or fewer dimensions. This is analogous to lowering the rank of a matrix: we decompose the matrix into a lower-order one such that no feature is linearly dependent on another feature or on a combination of other features. The algorithm:

  1. From every element of the matrix P we subtract the mean of the elements located in the same column. We name this new matrix P’. This is called mean-correction.
  2. More…
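Step 1, mean-correction, can be sketched as follows. This is a minimal pure-Python illustration that assumes rows are observations and columns are features; the function name `mean_correct` and the sample matrix are made up for the example, and the later steps of the algorithm are not shown.

```python
# Minimal sketch of step 1 (mean-correction). Assumes rows are observations
# and columns are features; the name and the data are illustrative.


def mean_correct(matrix):
    """Subtract each column's mean from every element in that column."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    column_means = [
        sum(row[j] for row in matrix) / n_rows for j in range(n_cols)
    ]
    return [
        [row[j] - column_means[j] for j in range(n_cols)] for row in matrix
    ]


# Hypothetical matrix P with three observations of two features.
P = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
P_prime = mean_correct(P)
# P_prime == [[-1.0, -10.0], [0.0, 0.0], [1.0, 10.0]]
```

After this step every column of P’ sums to zero, so the features are centred on the origin.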
