Data Science – [Machine Learning] using R / Python and Visualization using Tableau

Contact us for a demo lecture.

Detailed Topic Description:

  1. Descriptive Statistics: Introduction to the course, descriptive statistics, probability distributions
  • Types of data
  • Measures of Central Tendency
  • Measures of Variance
  • Probability Rules
  • Probability Distributions: Normal Distribution / Binomial Distribution / Poisson Distribution
  • Estimations and Proportions
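As a small illustration of the measures of central tendency and variance listed above, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for demonstration and are not course data.

```python
import statistics

# Hypothetical sample, invented for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)

# Measures of variance (population versions, which divide by n)
variance = statistics.pvariance(sample)
std_dev = statistics.pstdev(sample)

print(mean, median, mode, variance, std_dev)
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would usually be the more appropriate choice.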
  2. Inferential Statistics: Inferential statistics through hypothesis tests, permutation and randomization tests
  • Hypothesis Testing Basics
  • Error Types
  • Hypothesis test for a one-sample population mean
  • Hypothesis test for two-sample population means
  • Paired t-test
  • Tests for variance / correlation
  • Confidence Intervals
  • Chi-square test
  3. Analysis of Variance
  • One-way Analysis of Variance
  • Two-way Analysis of Variance
  4. Basics of R Programming
  • How to install R and RStudio
  • R data types, variables and operators
  • R data frames, basic functions in R
  • Subsetting, merging, recoding, aggregating, ordering, binding data in R
  • User-defined functions
  • Packages in R
  • apply(), lapply(), sapply(), tapply() in R
  • Missing Value Imputation in R
  • Removing Outliers from data
  5. Machine Learning: Introduction and concepts; differentiating algorithmic and model-based frameworks; regression: Ordinary Least Squares in R
  • Difference between Supervised and Unsupervised Learning
  • Visualization techniques
  • To formulate simple and multiple regression models
  • To give an account of the principle of least squares
  • To carry out tests of linear hypothesis
  • To perform validation of a regression model
  • To perform Cross Validation / Stepwise Regression
  • To select the important explanatory variables
  • To use R for analyzing real data sets
  • To be able to interpret the results in practical examples
  6. Binary Logistic Regression in R
  • The log-odds (logit) transformation
  • Logistic models and Logit models
  • Implementation of BLR in R
  • Interpretation of results
  • ROC curve
  • Confusion Matrix
  7. K-Nearest Neighbors Regression & Classification in R
  • KNN algorithm
  • Distance Measure Methods
  • Majority Vote Concept
  • KNN Regression in R
  • KNN Classification in R
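The course implements KNN in R. Purely to illustrate the distance-measure and voting ideas listed above, here is a minimal plain-Python sketch. The function name `knn_classify`, the Euclidean distance choice, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Predict a label for `query` by a vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    # Sort training points by their distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Collect the labels of the k closest points and take the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-class toy data, invented for illustration only.
training_data = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 7.5), "B"),
    ((6.5, 8.0), "B"),
]

prediction = knn_classify(training_data, (2.0, 2.0), k=3)
print(prediction)
```

KNN regression follows the same pattern, except that the k nearest target values are averaged instead of voted on.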

  8. Decision Tree and Random Forest in R
  • Classification Trees
  • Regression Trees
  • Regularization and pruning
  • Ensemble Models
  • Bagging / Boosting / Out of Bag Error
  • Random Forest and Decision Tree implementation in R
  9. Support Vector Machine
  • Construction of SVM
  • Support Vectors and Hyperplanes
  • Kernel Trick
  • Hard vs Soft Margin SVM
  • Implementation and Result Interpretation in R
  10. Naïve Bayes
  • Conditional Probability
  • Bayes' Theorem
  • Implementation and Result Interpretation in R
  11. Introduction to Artificial Neural Networks
  • Working of ANN
  • Similarity between ANN and biological neural systems
  • Single and Multilayer networks
  12. Unsupervised Machine Learning
  • Intra-cluster and inter-cluster analysis
  • Hierarchical Clustering
  • K-means Clustering
  13. Dimension Reduction Techniques
  • Principal Component Analysis
  • Principal Components Regression
  14. Association Rule Mining
  • Market Basket Analysis
  • Understanding support, confidence, and lift
  • Implementation and interpretation of Market Basket Analysis in R
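The course covers Market Basket Analysis in R. As a language-neutral illustration of support, confidence, and lift, here is a minimal plain-Python sketch. The transactions and the helper names `support`, `confidence`, and `lift` are assumptions made for this example.

```python
# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"bread"}, {"milk"})
print(support(rule[0] | rule[1]), confidence(*rule), lift(*rule))
```

A lift above 1 suggests the two itemsets appear together more often than they would if independent, while a lift below 1 suggests the opposite.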


  15. Introduction and Setting Up Your Integrated Analysis Environment
  • Python Shell
  • Custom environment settings
  • Jupyter Notebooks
  • Script editor
  • Packages: NumPy, SciPy, scikit-learn, Pandas, Matplotlib, Seaborn, etc.
  16. Using Python to Control and Document Your Data Science Processes
  • Python Essentials
  • Data types and objects
  • Loading packages, namespaces
  • Reading and writing data
  • Simple plotting
  • Control flow
  • Debugging
  • Code profiling
  17. Accessing and Preparing Data
  • Loading from CSV files
  • Accessing SQL databases
  • Cleansing Data with Python
  • Stripping out extraneous information
  • Normalizing data
  • Formatting data


  18. Numerical Analysis, Data Exploration, and Data Visualization with NumPy Arrays, Matplotlib, and Seaborn
  • NumPy Essentials
  • The NumPy array
  • N-dimensional array operations and manipulations
  • Memory-mapped files
  • Data Visualization
  • 2D plotting with Matplotlib
  • Advanced data visualization with Seaborn

  19. Exploring Data with Pandas
  • Searching for Gold in a Pile of Pyrite
  • Data manipulation with Pandas
  • Statistical analysis with Pandas
  • Time series analysis with Pandas
  20. Machine Learning with scikit-learn
  • Predicting the Future Can Be Good for Business
  • Input: 2D, samples, and features
  • Estimator, predictor, transformer interfaces
  • Pre-processing data
  • Regression
  • Classification
  • Model selection
  21. Data Visualization using Tableau


    For Further Details and Enrollment mail to