English | 2016 | MP4 | AVC 1280×720 | AAC 44 kHz 2ch | 5.5 Hours | 2.14 GB

Over 100 hands-on tasks to help you effectively solve real-world data problems using the most popular R packages and techniques

R is both a data analysis environment and a programming language. Data scientists, statisticians, and analysts use R for statistical analysis, data visualization, and predictive modeling. R is open source and integrates with other applications and systems. Compared to other data analysis platforms, R offers an extensive set of data products, and its data visualization features help clarify problems encountered in the data.

The first section in this course deals with how to create R functions to avoid the unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. An example of data manipulation is provided, illustrating how to use the ‘dplyr’ and ‘data.table’ packages to efficiently process larger data structures. We also focus on ‘ggplot2’ and show you how to create advanced figures for data exploration.

In addition, you will learn how to build an interactive report using the ‘ggvis’ package. Later sections offer insight into time series analysis and cover machine learning in detail, including data classification, regression, clustering, association rule mining, and dimension reduction.

By the end of this course, you will be able to diagnose and resolve the problems that commonly arise while performing data analysis in R.

## What You Will Learn

- Get to know the functional characteristics of the R language
- Extract, transform, and load data from heterogeneous sources
- Understand how R can be used to tackle probability and statistics problems
- Learn simple R instructions to quickly organize and manipulate large datasets
- Create professional data visualizations and interactive reports
- Predict user purchase behavior by adopting a classification approach
- Implement data mining techniques to discover items that are frequently purchased together
- Group similar text documents by using various clustering methods

## Table of Contents

**Functions in R**

1.R Functions and Arguments

2.Understanding Environments

3.Working with Lexical Scoping

4.Understanding Closure

5.Performing Lazy Evaluation

6.Creating Infix Operators

7.Using the Replacement Function

8.Handling Errors in a Function

9.The Debugging Function

**Data Extracting, Transforming, and Loading**

10.Downloading Open Data

11.Reading and Writing CSV Files

12.Scanning Text Files

13.Working with Excel Files

14.Reading Data from Databases

15.Scraping Web Data

**Data Pre-Processing and Preparation**

16.Renaming the Data Variable

17.Converting Data Types

18.Working with Date Format

19.Adding New Records

20.Filtering Data

21.Dropping Data

22.Merging and Sorting Data

23.Reshaping Data

24.Detecting Missing Data

25.Imputing Missing Data

**Data Manipulation**

26.Enhancing a data.frame with a data.table

27.Managing Data with data.table

28.Performing Fast Aggregation with data.table

29.Merging Large Datasets with a data.table

30.Subsetting and Slicing Data with dplyr

31.Sampling Data with dplyr

32.Selecting Columns with dplyr

33.Chaining Operations in dplyr

34.Arranging Rows with dplyr

35.Eliminating Duplicated Rows with dplyr

36.Adding New Columns with dplyr

37.Summarizing Data with dplyr

38.Merging Data with dplyr

**Visualizing Data with ggplot2**

39.Creating Basic Plots with ggplot2

40.Changing Aesthetics Mapping

41.Introducing Geometric Objects

42.Performing Transformations

43.Adjusting Scales

44.Faceting

45.Adjusting Themes

46.Combining Plots

47.Creating Maps

**Making Interactive Reports**

48.Creating R Markdown Reports

49.Learning the Markdown Syntax

50.Embedding R Code Chunks

51.Creating Interactive Graphics with ggvis

52.Understanding Basic Syntax and Grammar

53.Controlling Axes and Legends and Using Scales

54.Adding Interactivity to a ggvis Plot

55.Creating an R Shiny Document

56.Publishing an R Shiny Report

**Simulation from Probability Distributions**

57.Generating Random Samples

58.Understanding Uniform Distributions

59.Generating Binomial Random Variates

60.Generating Poisson Random Variates

61.Sampling from a Normal Distribution

62.Sampling from a Chi-Squared Distribution

63.Understanding Student’s t-Distribution

64.Sampling from a Dataset

65.Simulating the Stochastic Process

**Statistical Inference in R**

66.Getting Confidence Intervals

67.Performing Z-tests

68.Performing Student’s t-Tests

69.Conducting Exact Binomial Tests

70.Performing Kolmogorov-Smirnov Tests

71.Working with Pearson’s Chi-Squared Tests

72.Understanding the Wilcoxon Rank Sum and Signed Rank Tests

73.Conducting One-way ANOVA

74.Performing Two-way ANOVA

**Rule and Pattern Mining with R**

75.Transforming Data into Transactions

76.Displaying Transactions and Associations

77.Mining Associations with the Apriori Rule

78.Pruning Redundant Rules

79.Visualizing Association Rules

80.Mining Frequent Itemsets with Eclat

81.Creating Transactions with Temporal Information

82.Mining Frequent Sequential Patterns with cSPADE

**Time Series Mining with R**

83.Creating Time Series Data

84.Plotting a Time Series Object

85.Decomposing Time Series

86.Smoothing Time Series

87.Forecasting Time Series

88.Selecting an ARIMA Model

89.Creating an ARIMA Model

90.Forecasting with an ARIMA Model

91.Predicting Stock Prices with an ARIMA Model

**Supervised Machine Learning**

92.Fitting a Linear Regression Model with lm

93.Summarizing Linear Model Fits

94.Using Linear Regression to Predict Unknown Values

95.Measuring the Performance of the Regression Model

96.Performing a Multiple Regression Analysis

97.Selecting the Best-Fitted Regression Model with Stepwise Regression

98.Applying the Gaussian Model for Generalized Linear Regression

99.Performing a Logistic Regression Analysis

100.Building a Classification Model with Recursive Partitioning Trees

101.Visualizing a Recursive Partitioning Tree

102.Measuring Model Performance with a Confusion Matrix

103.Measuring Prediction Performance Using ROCR

**Unsupervised Machine Learning**

104.Clustering Data with Hierarchical Clustering

105.Cutting a Tree into Clusters

106.Clustering Data with the k-means Method

107.Clustering Data with the Density-Based Method

108.Extracting Silhouette Information from Clustering

109.Comparing Clustering Methods

110.Recognizing Digits Using the Density-Based Clustering Method

111.Grouping Similar Text Documents with the k-means Clustering Method

112.Performing Dimension Reduction with Principal Component Analysis (PCA)

113.Determining the Number of Principal Components Using a Scree Plot

114.Determining the Number of Principal Components Using the Kaiser Method

115.Visualizing Multivariate Data Using a biplot
