Data science

Data Science Course Curriculum

Data Science Introduction
  • Data Science motivating examples: Nate Silver, Netflix, Moneyball, OkCupid, LinkedIn
  • Introduction to Analytics, Types of Analytics
  • Introduction to Analytics Methodology
  • Analytics Terminology, Analytics Tools
  • Introduction to Big Data
  • Introduction to Machine Learning
R Software:
1. Introduction and Overview of R Language:
  • Origin of R, Interface of R, R Coding Practices
  • Downloading and Installing R
  • Getting Help on a function
  • Viewing Documentation
2. Data Inputting in R
  • Data Types, Data Objects, Data Structures
  • Creating a vector and vector operations
  • Sub-setting
  • Writing data
  • Reading tabular data files
  • Reading from csv files
  • Initializing a data frame
  • Selecting data frame columns by position and name
  • Changing directories
  • Re-directing R output
3. Data Manipulation in R
  • Appending data to a vector
  • Combining multiple vectors
  • Merging data frames
  • Data transformation
  • Control structures
  • Nested Loops
  • Splitting
  • Strings and dates
  • Handling NAs and Missing Values
  • Matrices and Arrays
  • The str Function
  • Logical operations
  • Relational operators
  • Generating Random Variables
  • Accessing Variables
  • Matrix Multiplication and Inversion
  • Managing Subset of data
  • Character manipulation
  • Data aggregation
  • Subscripting
4. Functions and Programming in R
  • Flow Control: For loop
  • If condition
  • While conditions and repeat loop
  • Debugging tools
  • Concatenation of Data
  • Combining variables with cbind and rbind
  • sapply, lapply, tapply functions
Basic Statistics in R:
Part-I Session 1: Descriptive Statistics
  • Introduction to Advanced Data Analytics
  • Statistical inferences for various Business problems
  • Types of Variables, measures of central tendency and dispersion
  • Variable Distributions and Probability Distributions
  • Normal Distribution and Properties
  • Computing basic statistics
  • Comparing means of two samples
  • Testing a correlation for significance
  • Testing a proportion
  • Classical tests (t,z,F)
  • ANOVA
  • Summarizing Data
  • Data Munging Basics
Part-I Session 2: Test of Hypothesis
  • Null/Alternative Hypothesis formulation
  • One-sample and two-sample (paired and independent) t/z tests
  • P Value Interpretation
  • Analysis of Variance (ANOVA)
  • Non-Parametric Tests (Chi-Square, Kruskal-Wallis, Mann-Whitney)
Part-I Session 3
  • Introduction to Correlation: Karl Pearson coefficient
  • Spearman Rank Correlation
Advanced Analytics with real-world examples (Mini Projects)
Part-II Session 1
  • Regression Theory
  • Linear regression
  • Logistic Regression
  • Non-Linear Regressions using Link functions
  • Logit Link Function
  • Binomial Propensity Modeling
  • Training-Validation approach
Part-II Session 2: Factor Analysis
  • Introduction to Factor Analysis and PCA
  • Reliability Test
  • KMO MSA test, Eigenvalue Interpretation
  • Factor Rotation and Extraction
Part-II Session 3: Cluster Analysis
  • Introduction to Cluster Techniques
  • Distance Methodologies
  • Hierarchical and Non-Hierarchical Procedures
  • K-Means clustering
  • Ward's Method
Time Series Analysis
Part-III Session 1: Introduction and Exponential Smoothing
  • Introduction to Time Series Data and Analysis
  • Decomposition of Time Series
  • Trend and Seasonality detection and forecasting
  • Exponential Smoothing (Single, double and triple)
Part-III Session 2: ARIMA Modeling
  • Box-Jenkins Methodology
  • Introduction to Auto Regression and Moving Averages, ACF, PACF
Data Mining: Machine Learning with R
Part IV Session 1
  • Introduction to Machine learning and various machine learning techniques
  • Introduction to Data Mining
  • Introduction to Text Mining
  • Text Analytics Process
  • Sentiment Analysis
Part IV Session 2
  • Statistical Analysis & Data Mining/Machine Learning
  • Cluster Analysis using R-Rattle
  • Association Rule Mining
  • Predictive Modeling using Decision Trees
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Neural Networks
  • Support Vector Machines
Part IV Session 3: Evaluating & Deploying Models
  • Evaluating performance of a model on Training and Validation data
  • ROC, Sensitivity, Specificity, Lift charts, Error Matrix
  • Deploying models using Score options
  • Opening and Saving models using Rattle
Analytics in Excel - 3 days
  • Data Preparation and Data Exploration in Excel
  • Network Analysis using NodeXL
Data Visualization in R
  • Creating a bar chart, dot plot
  • Creating a scatter plot, pie chart
  • Creating a histogram and box plot
  • Other plotting functions
  • Plotting with base graphics
  • Plotting with Lattice graphics
  • Plotting and coloring in R