Tag Archives for statistics

ROC-Receiver Operating Characteristic

Two days ago, someone I had not met before and I had a healthy conversation about classification models. We were discussing the techniques and approaches that could be followed to select an optimal classification model for better results. At some point I suggested ROC curves to him for that purpose. Receiver operating […]

What is the difference between Artificial Intelligence, Machine Learning, Statistics, and Data Mining

A few days ago I saw an interesting question on stats.stackexchange.com that held my attention for a while. After spending a few minutes reading and analyzing all the answers there, I felt like writing down what I would have answered if I really had to. What is the difference between Artificial Intelligence, Machine Learning, Statistics, […]

Mining Associations with Apriori using R – Part 2

Prologue: I have been working on and practicing various skills and algorithms as part of my road map to becoming a mature data scientist. As part of this expedition, I have decided to document everything I go through. So whatever you read under this column will be either a summary […]

Mining Associations with Apriori using R – Part 1

Prologue: I have been working on and practicing various skills and algorithms as part of my road map to becoming a mature data scientist. As part of this expedition, I have decided to document everything I go through. So whatever you read under this column will be either a summary […]

Calculating Confidence Interval for Classification Accuracy

Prologue: I have been working on and practicing various skills and algorithms as part of my road map to becoming a mature data scientist. As part of this expedition, I have decided to document everything I go through. So whatever you read under this column will either be […]

Prediction using Simple linear regression in R – part 2

Prologue: I have been working on and practicing various skills and algorithms as part of my road map to becoming a mature data scientist. As part of this expedition, I have decided to document everything I go through. So whatever you read under this column will be either a summary […]

Prediction using Simple linear regression in R – part 1

Prologue: I have been working on and practicing various skills and algorithms as part of my road map to becoming a mature data scientist. As part of this expedition, I have decided to document everything I go through. So whatever you read under this column will be either a summary of […]

Association Mining

Objective: To use the Apriori algorithm to find frequent itemsets using candidate generation, and to generate association rules from the frequent itemsets.

Assumption: Consider a transaction table:

TID     List of item IDs
T-1     I1, I2, I5
T-2     I2, I4
T-3     I2, I3
T-4     I1, I2, I4
T-5     I1, I3
T-6     I2, I3
T-7     I1, I3
T-8     I1, I2, I3, I5
T-9     I1, I2, I3
T-10    I4, I5

In the first iteration of […]
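The candidate-generation idea named in the objective can be sketched in a few lines of Python over the ten transactions listed in the excerpt. This is only an illustrative sketch, not the post's own code: the excerpt does not state a minimum support, so the threshold of 2 transactions is an assumption, and the names `TRANSACTIONS`, `support_count` and `apriori` are hypothetical.

```python
from itertools import combinations

# Transaction table copied from the excerpt (TID -> set of item IDs).
TRANSACTIONS = {
    "T-1": {"I1", "I2", "I5"},
    "T-2": {"I2", "I4"},
    "T-3": {"I2", "I3"},
    "T-4": {"I1", "I2", "I4"},
    "T-5": {"I1", "I3"},
    "T-6": {"I2", "I3"},
    "T-7": {"I1", "I3"},
    "T-8": {"I1", "I2", "I3", "I5"},
    "T-9": {"I1", "I2", "I3"},
    "T-10": {"I4", "I5"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori(transactions, min_count):
    """Return {frequent itemset: support count}, built level by level."""
    all_items = sorted({item for items in transactions.values() for item in items})
    candidates = [frozenset([item]) for item in all_items]
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep those meeting the threshold.
        level = {}
        for cand in candidates:
            count = support_count(cand, transactions)
            if count >= min_count:
                level[cand] = count
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = [
            cand for cand in joined
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1))
        ]
    return frequent


# min_count=2 is an assumed threshold; the excerpt does not give one.
frequent = apriori(TRANSACTIONS, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a different threshold the set of frequent itemsets changes, so the value 2 should be replaced by whatever minimum support the analysis actually calls for.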

Latent Semantic Analysis – Part 2

Preface: In Latent Semantic Analysis – Part 1 I covered the procedure for building a Term Document Matrix (TDM), as it is a prerequisite for building an LSI model. Now let's see how this TDM is supplied to SVD to obtain the U, S, and V matrices. Objective: To build a Latent Semantic Analysis (LSA) model […]

Latent Semantic Analysis – Part 1

Objective: To build a Latent Semantic Analysis (LSA) model to find statistical synonyms of a word from a huge corpus. The preliminary objective of this post is to build a Term Document Matrix (TDM), as it is a prerequisite for building an LSI model, so let's first see how to construct a TDM. Observation and legends: In order […]