**The Data Science Course 2020: Complete Data Science Bootcamp**

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 28.5 Hours | 15.2 GB

Complete Data Science Training: Mathematics, Statistics, Python, Advanced Statistics in Python, Machine & Deep Learning

The Problem

Data scientist is one of the professions best suited to thrive in this century. It is digital, programming-oriented, and analytical. Therefore, it comes as no surprise that the demand for data scientists has been surging in the job marketplace.

However, supply has been very limited. It is difficult to acquire the skills necessary to be hired as a data scientist.

And how can you do that?

Universities have been slow to create specialized data science programs (not to mention that the ones that exist are very expensive and time-consuming).

Most online courses focus on a specific topic, and it is difficult to understand how the skills they teach fit into the complete picture.

The Solution

Data science is a multidisciplinary field. It encompasses a wide range of topics.

- Understanding of the data science field and the type of analysis carried out
- Mathematics
- Statistics
- Python
- Applying advanced statistical techniques in Python
- Data Visualization
- Machine Learning
- Deep Learning

Each of these topics builds on the previous ones. And you risk getting lost along the way if you don’t acquire these skills in the right order. For example, one would struggle in the application of Machine Learning techniques before understanding the underlying Mathematics. Or, it can be overwhelming to study regression analysis in Python before knowing what a regression is.

So, in an effort to create the most effective, time-efficient, and structured data science training available online, we created The Data Science Course 2020.

We believe this is the first training program that solves the biggest challenge to entering the data science field – having all the necessary resources in one place.

Moreover, our focus is to teach topics that flow smoothly and complement each other. The course teaches you everything you need to know to become a data scientist at a fraction of the cost of traditional programs (not to mention the amount of time you will save).

The Skills

1. Intro to Data and Data Science

Big data, business intelligence, business analytics, machine learning and artificial intelligence. We know these buzzwords belong to the field of data science but what do they all mean?

Why learn it? As a candidate data scientist, you must understand the ins and outs of each of these areas and recognise the appropriate approach to solving a problem. This ‘Intro to data and data science’ will give you a comprehensive look at all these buzzwords and where they fit in the realm of data science.

2. Mathematics

Learning the tools is the first step to doing data science. You must first see the big picture to then examine the parts in detail.

We take a detailed look specifically at calculus and linear algebra as they are the subfields data science relies on.

Why learn it?

Calculus and linear algebra are essential for programming in data science. If you want to understand advanced machine learning algorithms, then you need these skills in your arsenal.

3. Statistics

You need to think like a scientist before you can become a scientist. Statistics trains your mind to frame problems as hypotheses and gives you techniques to test these hypotheses, just like a scientist.

Why learn it?

This course doesn’t just give you the tools you need but teaches you how to use them. Statistics trains you to think like a scientist.

4. Python

Python is a relatively new programming language and, unlike R, it is a general-purpose programming language. You can do almost anything with it! Web applications, computer games, and data science are among its many applications. That’s why, in a short space of time, it has managed to disrupt many disciplines. Extremely powerful libraries have been developed to enable data manipulation, transformation, and visualisation. Where Python really shines, however, is in machine and deep learning.

Why learn it?

When it comes to developing, implementing, and deploying machine learning models through powerful frameworks such as scikit-learn and TensorFlow, Python is a must-have programming language.

5. Tableau

Data scientists don’t just need to deal with data and solve data-driven problems. They also need to convince company executives of the right decisions to make. These executives may not be well versed in data science, so the data scientist must be able to present and visualise the data’s story in a way they will understand. That’s where Tableau comes in – and we will help you become an expert storyteller using the leading visualisation software in business intelligence and data science.

Why learn it?

A data scientist relies on business intelligence tools like Tableau to communicate complex results to non-technical decision makers.

6. Advanced Statistics

Regressions, clustering, and factor analysis are all disciplines that were invented before machine learning. However, now these statistical methods are all performed through machine learning to provide predictions with unparalleled accuracy. This section will look at these techniques in detail.

Why learn it?

Data science is all about predictive modelling and you can become an expert in these methods through this ‘advanced statistics’ section.

7. Machine Learning

The final part of the program and what every section has been leading up to is deep learning. Being able to employ machine and deep learning in their work is what often separates a data scientist from a data analyst. This section covers all common machine learning techniques and deep learning methods with TensorFlow.

What you’ll learn

- The course provides the entire toolbox you need to become a data scientist
- Fill up your resume with in-demand data science skills: Statistical analysis; Python programming with NumPy, pandas, matplotlib, and Seaborn; Advanced statistical analysis; Tableau; Machine Learning with statsmodels and scikit-learn; Deep Learning with TensorFlow
- Impress interviewers by showing an understanding of the data science field
- Learn how to pre-process data
- Understand the mathematics behind Machine Learning (an absolute must which other courses don’t teach!)
- Start coding in Python and learn how to use it for statistical analysis
- Perform linear and logistic regressions in Python
- Carry out cluster and factor analysis
- Be able to create Machine Learning algorithms in Python, using NumPy, statsmodels and scikit-learn
- Apply your skills to real-life business cases
- Use state-of-the-art Deep Learning frameworks such as Google’s TensorFlow
- Develop a business intuition while coding and solving tasks with big data
- Unfold the power of deep neural networks
- Improve Machine Learning algorithms by studying underfitting, overfitting, training, validation, n-fold cross validation, testing, and how hyperparameters could improve performance
- Warm up your fingers as you will be eager to apply everything you have learned here to more and more real-life situations

**Table of Contents**

**Part 1 Introduction**

A Practical Example What You Will Learn in This Course

What Does the Course Cover

Download All Resources and Important FAQ

**The Field of Data Science – The Various Data Science Disciplines**

Data Science and Business Buzzwords Why are there so Many

A Breakdown of our Data Science Infographic

Data Science and Business Buzzwords Why are there so Many

What is the difference between Analysis and Analytics

Business Analytics, Data Analytics, and Data Science An Introduction

Continuing with BI, ML, and AI

A Breakdown of our Data Science Infographic

**The Field of Data Science – Connecting the Data Science Disciplines**

Applying Traditional Data, Big Data, BI, Traditional Data Science and ML

**The Field of Data Science – The Benefits of Each Discipline**

The Reason Behind These Disciplines

**The Field of Data Science – Popular Data Science Techniques**

Techniques for Working with Traditional Data

Techniques for Working with Traditional Methods

Real Life Examples of Traditional Methods

Machine Learning (ML) Techniques

Types of Machine Learning

Real Life Examples of Machine Learning (ML)

Techniques for Working with Traditional Data

Real Life Examples of Traditional Data

Techniques for Working with Big Data

Real Life Examples of Big Data

Business Intelligence (BI) Techniques

Real Life Examples of Business Intelligence (BI)

**The Field of Data Science – Popular Data Science Tools**

Necessary Programming Languages and Software Used in Data Science

**The Field of Data Science – Careers in Data Science**

Finding the Job – What to Expect and What to Look for

**The Field of Data Science – Debunking Common Misconceptions**

Debunking Common Misconceptions

**Part 2 Probability**

The Basic Probability Formula

Computing Expected Values

Frequency

Events and Their Complements

**Probability – Combinatorics**

Fundamentals of Combinatorics

Solving Variations without Repetition

Solving Combinations

Symmetry of Combinations

Solving Combinations with Separate Sample Spaces

Combinatorics in Real-Life The Lottery

A Recap of Combinatorics

Fundamentals of Combinatorics

A Practical Example of Combinatorics

Permutations and How to Use Them

Simple Operations with Factorials

Solving Variations with Repetition

Solving Variations without Repetition

**Probability – Bayesian Inference**

Sets and Events

Mutually Exclusive Sets

Dependence and Independence of Sets

The Conditional Probability Formula

The Law of Total Probability

The Additive Rule

The Multiplication Law

Sets and Events

Bayes’ Law

A Practical Example of Bayesian Inference

Ways Sets Can Interact

Intersection of Sets

Union of Sets

Mutually Exclusive Sets

**Probability – Distributions**

Fundamentals of Probability Distributions

Discrete Distributions The Bernoulli Distribution

Discrete Distributions The Binomial Distribution

Discrete Distributions The Binomial Distribution

Discrete Distributions The Poisson Distribution

Discrete Distributions The Poisson Distribution

Characteristics of Continuous Distributions

Characteristics of Continuous Distributions

Continuous Distributions The Normal Distribution

Continuous Distributions The Normal Distribution

Continuous Distributions The Standard Normal Distribution

Fundamentals of Probability Distributions

Continuous Distributions The Standard Normal Distribution

Continuous Distributions The Students’ T Distribution

Continuous Distributions The Students’ T Distribution

Continuous Distributions The Chi-Squared Distribution

Continuous Distributions The Chi-Squared Distribution

Continuous Distributions The Exponential Distribution

Continuous Distributions The Exponential Distribution

Continuous Distributions The Logistic Distribution

Continuous Distributions The Logistic Distribution

A Practical Example of Probability Distributions

Types of Probability Distributions

Types of Probability Distributions

Characteristics of Discrete Distributions

Characteristics of Discrete Distributions

Discrete Distributions The Uniform Distribution

Discrete Distributions The Uniform Distribution

Discrete Distributions The Bernoulli Distribution

**Probability – Probability in Other Fields**

Probability in Finance

Probability in Statistics

Probability in Data Science

**Part 3 Statistics**

Population and Sample

Population and Sample

**Statistics – Descriptive Statistics**

Types of Data

Numerical Variables Exercise

The Histogram

The Histogram

Histogram Exercise

Cross Tables and Scatter Plots

Cross Tables and Scatter Plots

Cross Tables and Scatter Plots Exercise

Mean, median and mode

Mean, Median and Mode Exercise

Skewness

Types of Data

Skewness

Skewness Exercise

Variance

Variance Exercise

Standard Deviation and Coefficient of Variation

Standard Deviation

Standard Deviation and Coefficient of Variation Exercise

Covariance

Covariance

Covariance Exercise

Levels of Measurement

Correlation Coefficient

Correlation

Correlation Coefficient Exercise

Levels of Measurement

Categorical Variables – Visualization Techniques

Categorical Variables – Visualization Techniques

Categorical Variables Exercise

Numerical Variables – Frequency Distribution Table

Numerical Variables – Frequency Distribution Table

**Statistics – Practical Example Descriptive Statistics**

Practical Example Descriptive Statistics

Practical Example Descriptive Statistics Exercise

**Statistics – Inferential Statistics Fundamentals**

Introduction

Central Limit Theorem

Standard error

Standard Error

Estimators and Estimates

Estimators and Estimates

What is a Distribution

What is a Distribution

The Normal Distribution

The Normal Distribution

The Standard Normal Distribution

The Standard Normal Distribution

The Standard Normal Distribution Exercise

Central Limit Theorem

**Statistics – Inferential Statistics Confidence Intervals**

What are Confidence Intervals

Margin of Error

Margin of Error

Confidence intervals. Two means. Dependent samples

Confidence intervals. Two means. Dependent samples Exercise

Confidence intervals. Two means. Independent Samples (Part 1)

Confidence intervals. Two means. Independent Samples (Part 1). Exercise

Confidence intervals. Two means. Independent Samples (Part 2)

Confidence intervals. Two means. Independent Samples (Part 2). Exercise

Confidence intervals. Two means. Independent Samples (Part 3)

What are Confidence Intervals

Confidence Intervals; Population Variance Known; Z-score

Confidence Intervals; Population Variance Known; Z-score; Exercise

Confidence Interval Clarifications

Student’s T Distribution

Student’s T Distribution

Confidence Intervals; Population Variance Unknown; T-score

Confidence Intervals; Population Variance Unknown; T-score; Exercise

**Statistics – Practical Example Inferential Statistics**

Practical Example Inferential Statistics

Practical Example Inferential Statistics Exercise

**Statistics – Hypothesis Testing**

Null vs Alternative Hypothesis

p-value

p-value

Test for the Mean. Population Variance Unknown

Test for the Mean. Population Variance Unknown Exercise

Test for the Mean. Dependent Samples

Test for the Mean. Dependent Samples Exercise

Test for the mean. Independent Samples (Part 1)

Test for the mean. Independent Samples (Part 1). Exercise

Test for the mean. Independent Samples (Part 2)

Test for the mean. Independent Samples (Part 2)

Further Reading on Null and Alternative Hypothesis

Test for the mean. Independent Samples (Part 2). Exercise

Null vs Alternative Hypothesis

Rejection Region and Significance Level

Rejection Region and Significance Level

Type I Error and Type II Error

Type I Error and Type II Error

Test for the Mean. Population Variance Known

Test for the Mean. Population Variance Known Exercise

**Statistics – Practical Example Hypothesis Testing**

Practical Example Hypothesis Testing

Practical Example Hypothesis Testing Exercise

**Part 4 Introduction to Python**

Introduction to Programming

Jupyter’s Interface

Python 2 vs Python 3

Introduction to Programming

Why Python

Why Python

Why Jupyter

Why Jupyter

Installing Python and Jupyter

Understanding Jupyter’s Interface – the Notebook Dashboard

Prerequisites for Coding in the Jupyter Notebooks

**Python – Variables and Data Types**

Variables

Variables

Numbers and Boolean Values in Python

Numbers and Boolean Values in Python

Python Strings

Python Strings

**Python – Basic Python Syntax**

Using Arithmetic Operators in Python

Indexing Elements

Indexing Elements

Structuring with Indentation

Structuring with Indentation

Using Arithmetic Operators in Python

The Double Equality Sign

The Double Equality Sign

How to Reassign Values

How to Reassign Values

Add Comments

Add Comments

Understanding Line Continuation

**Python – Other Python Operators**

Comparison Operators

Comparison Operators

Logical and Identity Operators

Logical and Identity Operators

**Python – Conditional Statements**

The IF Statement

The IF Statement

The ELSE Statement

The ELIF Statement

A Note on Boolean Values

A Note on Boolean Values

**Python – Python Functions**

Defining a Function in Python

How to Create a Function with a Parameter

Defining a Function in Python – Part II

How to Use a Function within a Function

Conditional Statements and Functions

Functions Containing a Few Arguments

Built-in Functions in Python

Python Functions

**Python – Sequences**

Lists

Lists

Using Methods

Using Methods

List Slicing

Tuples

Dictionaries

Dictionaries

**Python – Iterations**

For Loops

For Loops

While Loops and Incrementing

Lists with the range() Function

Lists with the range() Function

Conditional Statements and Loops

Conditional Statements, Functions, and Loops

How to Iterate over Dictionaries

**Python – Advanced Python Tools**

Object Oriented Programming

Object Oriented Programming

Modules and Packages

Modules and Packages

What is the Standard Library

What is the Standard Library

Importing Modules in Python

Importing Modules in Python

**Part 5 Advanced Statistical Methods in Python**

Introduction to Regression Analysis

Introduction to Regression Analysis

**Advanced Statistical Methods – Linear Regression with StatsModels**

The Linear Regression Model

Using Seaborn for Graphs

How to Interpret the Regression Table

How to Interpret the Regression Table

Decomposition of Variability

Decomposition of Variability

What is the OLS

What is the OLS

R-Squared

R-Squared

The Linear Regression Model

Correlation vs Regression

Correlation vs Regression

Geometrical Representation of the Linear Regression Model

Geometrical Representation of the Linear Regression Model

Python Packages Installation

First Regression in Python

First Regression in Python Exercise

**Advanced Statistical Methods – Multiple Linear Regression with StatsModels**

Multiple Linear Regression

A1 Linearity

A2 No Endogeneity

A2 No Endogeneity

A3 Normality and Homoscedasticity

A4 No Autocorrelation

A4 No autocorrelation

A5 No Multicollinearity

A5 No Multicollinearity

Dealing with Categorical Data – Dummy Variables

Dealing with Categorical Data – Dummy Variables

Multiple Linear Regression

Making Predictions with the Linear Regression

Adjusted R-Squared

Adjusted R-Squared

Multiple Linear Regression Exercise

Test for Significance of the Model (F-Test)

OLS Assumptions

OLS Assumptions

A1 Linearity

**Advanced Statistical Methods – Linear Regression with sklearn**

What is sklearn and How is it Different from Other Packages

Feature Selection (F-regression)

A Note on Calculation of P-values with sklearn

Creating a Summary Table with P-values

Multiple Linear Regression – Exercise

Feature Scaling (Standardization)

Feature Selection through Standardization of Weights

Predicting with the Standardized Coefficients

Feature Scaling (Standardization) – Exercise

Underfitting and Overfitting

Train – Test Split Explained

How are we Going to Approach this Section

Simple Linear Regression with sklearn

Simple Linear Regression with sklearn – A StatsModels-like Summary Table

A Note on Normalization

Simple Linear Regression with sklearn – Exercise

Multiple Linear Regression with sklearn

Calculating the Adjusted R-Squared in sklearn

Calculating the Adjusted R-Squared in sklearn – Exercise

**Advanced Statistical Methods – Practical Example Linear Regression**

Practical Example Linear Regression (Part 1)

Practical Example Linear Regression (Part 2)

A Note on Multicollinearity

Practical Example Linear Regression (Part 3)

Dummies and Variance Inflation Factor – Exercise

Practical Example Linear Regression (Part 4)

Dummy Variables – Exercise

Practical Example Linear Regression (Part 5)

Linear Regression – Exercise

**Advanced Statistical Methods – Logistic Regression**

Introduction to Logistic Regression

Binary Predictors in a Logistic Regression

Binary Predictors in a Logistic Regression – Exercise

Calculating the Accuracy of the Model

Calculating the Accuracy of the Model

Underfitting and Overfitting

Testing the Model

Testing the Model – Exercise

A Simple Example in Python

Logistic vs Logit Function

Building a Logistic Regression

Building a Logistic Regression – Exercise

An Invaluable Coding Tip

Understanding Logistic Regression Tables

Understanding Logistic Regression Tables – Exercise

What do the Odds Actually Mean

**Advanced Statistical Methods – Cluster Analysis**

Introduction to Cluster Analysis

Some Examples of Clusters

Difference between Classification and Clustering

Math Prerequisites

**Advanced Statistical Methods – K-Means Clustering**

K-Means Clustering

Relationship between Clustering and Regression

Market Segmentation with Cluster Analysis (Part 1)

Market Segmentation with Cluster Analysis (Part 2)

How is Clustering Useful

EXERCISE Species Segmentation with Cluster Analysis (Part 1)

EXERCISE Species Segmentation with Cluster Analysis (Part 2)

A Simple Example of Clustering

A Simple Example of Clustering – Exercise

Clustering Categorical Data

Clustering Categorical Data – Exercise

How to Choose the Number of Clusters

How to Choose the Number of Clusters – Exercise

Pros and Cons of K-Means Clustering

To Standardize or not to Standardize

**Advanced Statistical Methods – Other Types of Clustering**

Types of Clustering

Dendrogram

Heatmaps

**Part 6 Mathematics**

What is a Matrix

Addition and Subtraction of Matrices

Addition and Subtraction of Matrices

Errors when Adding Matrices

Transpose of a Matrix

Dot Product

Dot Product of Matrices

Why is Linear Algebra Useful

What is a Matrix

Scalars and Vectors

Scalars and Vectors

Linear Algebra and Geometry

Linear Algebra and Geometry

Arrays in Python – A Convenient Way To Represent Matrices

What is a Tensor

What is a Tensor

**Part 7 Deep Learning**

What to Expect from this Part

**Deep Learning – Introduction to Neural Networks**

Introduction to Neural Networks

The Linear Model with Multiple Inputs

The Linear model with Multiple Inputs and Multiple Outputs

The Linear model with Multiple Inputs and Multiple Outputs

Graphical Representation of Simple Neural Networks

Graphical Representation of Simple Neural Networks

What is the Objective Function

What is the Objective Function

Common Objective Functions L2-norm Loss

Common Objective Functions L2-norm Loss

Common Objective Functions Cross-Entropy Loss

Introduction to Neural Networks

Common Objective Functions Cross-Entropy Loss

Optimization Algorithm 1-Parameter Gradient Descent

Optimization Algorithm 1-Parameter Gradient Descent

Optimization Algorithm n-Parameter Gradient Descent

Optimization Algorithm n-Parameter Gradient Descent

Training the Model

Training the Model

Types of Machine Learning

Types of Machine Learning

The Linear Model (Linear Algebraic Version)

The Linear Model

The Linear Model with Multiple Inputs

**Deep Learning – How to Build a Neural Network from Scratch with NumPy**

Basic NN Example (Part 1)

Basic NN Example (Part 2)

Basic NN Example (Part 3)

Basic NN Example (Part 4)

Basic NN Example Exercises

**Deep Learning – TensorFlow 2.0 Introduction**

How to Install TensorFlow 2.0

TensorFlow Outline and Comparison with Other Libraries

TensorFlow 1 vs TensorFlow 2

A Note on TensorFlow 2 Syntax

Types of File Formats Supporting TensorFlow

Outlining the Model with TensorFlow 2

Interpreting the Result and Extracting the Weights and Bias

Customizing a TensorFlow 2 Model

Basic NN with TensorFlow Exercises

**Deep Learning – Digging Deeper into NNs Introducing Deep Neural Networks**

What is a Layer

What is a Deep Net

Digging into a Deep Net

Non-Linearities and their Purpose

Activation Functions

Activation Functions Softmax Activation

Backpropagation

Backpropagation Picture

Backpropagation – A Peek into the Mathematics of Optimization

**Deep Learning – Overfitting**

What is Overfitting

Underfitting and Overfitting for Classification

What is Validation

Training, Validation, and Test Datasets

N-Fold Cross Validation

Early Stopping or When to Stop Training

**Deep Learning – Initialization**

What is Initialization

Types of Simple Initializations

State-of-the-Art Method – (Xavier) Glorot Initialization

**Deep Learning – Digging into Gradient Descent and Learning Rate Schedules**

Stochastic Gradient Descent

Problems with Gradient Descent

Momentum

Learning Rate Schedules, or How to Choose the Optimal Learning Rate

Learning Rate Schedules Visualized

Adaptive Learning Rate Schedules (AdaGrad and RMSprop)

Adam (Adaptive Moment Estimation)

**Deep Learning – Preprocessing**

Preprocessing Introduction

Types of Basic Preprocessing

Standardization

Preprocessing Categorical Data

Binary and One-Hot Encoding

**Deep Learning – Classifying on the MNIST Dataset**

MNIST The Dataset

MNIST Learning

MNIST – Exercises

MNIST Testing the Model

MNIST How to Tackle the MNIST

MNIST Importing the Relevant Packages and Loading the Data

MNIST Preprocess the Data – Create a Validation Set and Scale It

MNIST Preprocess the Data – Scale the Test Data – Exercise

MNIST Preprocess the Data – Shuffle and Batch

MNIST Preprocess the Data – Shuffle and Batch – Exercise

MNIST Outline the Model

MNIST Select the Loss and the Optimizer

**Deep Learning – Business Case Example**

Business Case Exploring the Dataset and Identifying Predictors

Setting an Early Stopping Mechanism – Exercise

Business Case Testing the Model

Business Case Final Exercise

Business Case Outlining the Solution

Business Case Balancing the Dataset

Business Case Preprocessing the Data

Business Case Preprocessing the Data – Exercise

Business Case Load the Preprocessed Data

Business Case Load the Preprocessed Data – Exercise

Business Case Learning and Interpreting the Result

Business Case Setting an Early Stopping Mechanism

**Deep Learning – Conclusion**

Summary on What You’ve Learned

What’s Further out there in terms of Machine Learning

DeepMind and Deep Learning

An Overview of CNNs

An Overview of RNNs

An Overview of non-NN Approaches

**Appendix Deep Learning – TensorFlow 1 Introduction**

READ ME!!!!

Basic NN Example with TF Exercises

How to Install TensorFlow 1

A Note on Installing Packages in Anaconda

TensorFlow Intro

Actual Introduction to TensorFlow

Types of File Formats, supporting Tensors

Basic NN Example with TF Inputs, Outputs, Targets, Weights, Biases

Basic NN Example with TF Loss Function and Gradient Descent

Basic NN Example with TF Model Output

**Appendix Deep Learning – TensorFlow 1 Classifying on the MNIST Dataset**

MNIST What is the MNIST Dataset

MNIST Solutions

MNIST Exercises

MNIST How to Tackle the MNIST

MNIST Relevant Packages

MNIST Model Outline

MNIST Loss and Optimization Algorithm

Calculating the Accuracy of the Model

MNIST Batching and Early Stopping

MNIST Learning

MNIST Results and Testing

**Appendix Deep Learning – TensorFlow 1 Business Case**

Business Case Getting Acquainted with the Dataset

Business Case Testing the Model

Business Case A Comment on the Homework

Business Case Final Exercise

Business Case Outlining the Solution

The Importance of Working with a Balanced Dataset

Business Case Preprocessing

Business Case Preprocessing Exercise

Creating a Data Provider

Business Case Model Outline

Business Case Optimization

Business Case Interpretation

**Software Integration**

What are Data, Servers, Clients, Requests, and Responses

Software Integration – Explained

What are Data, Servers, Clients, Requests, and Responses

What are Data Connectivity, APIs, and Endpoints

What are Data Connectivity, APIs, and Endpoints

Taking a Closer Look at APIs

Taking a Closer Look at APIs

Communication between Software Products through Text Files

Communication between Software Products through Text Files

Software Integration – Explained

**Case Study – What’s Next in the Course**

Game Plan for this Python, SQL, and Tableau Business Exercise

The Business Task

Introducing the Data Set

Introducing the Data Set

**Case Study – Preprocessing the ‘Absenteeism data’**

What to Expect from the Following Sections

Analyzing the Reasons for Absence

Obtaining Dummies from a Single Feature

EXERCISE – Obtaining Dummies from a Single Feature

SOLUTION – Obtaining Dummies from a Single Feature

Dropping a Dummy Variable from the Data Set

More on Dummy Variables A Statistical Perspective

Classifying the Various Reasons for Absence

Using .concat() in Python

EXERCISE – Using .concat() in Python

SOLUTION – Using .concat() in Python

Importing the Absenteeism Data in Python

Reordering Columns in a Pandas DataFrame in Python

EXERCISE – Reordering Columns in a Pandas DataFrame in Python

SOLUTION – Reordering Columns in a Pandas DataFrame in Python

Creating Checkpoints while Coding in Jupyter

EXERCISE – Creating Checkpoints while Coding in Jupyter

SOLUTION – Creating Checkpoints while Coding in Jupyter

Analyzing the Dates from the Initial Data Set

Extracting the Month Value from the Date Column

Extracting the Day of the Week from the Date Column

EXERCISE – Removing the Date Column

Checking the Content of the Data Set

Analyzing Several Straightforward Columns for this Exercise

Working on Education, Children, and Pets

Final Remarks of this Section

A Note on Exporting Your Data as a .csv File

Introduction to Terms with Multiple Meanings

What’s Regression Analysis – a Quick Refresher

Using a Statistical Approach towards the Solution to the Exercise

Dropping a Column from a DataFrame in Python

EXERCISE – Dropping a Column from a DataFrame in Python

SOLUTION – Dropping a Column from a DataFrame in Python

**Case Study – Applying Machine Learning to Create the ‘absenteeism module’**

Exploring the Problem with a Machine Learning Mindset

Interpreting the Coefficients of the Logistic Regression

Backward Elimination or How to Simplify Your Model

Testing the Model We Created

Saving the Model and Preparing it for Deployment

ARTICLE – A Note on ‘pickling’

EXERCISE – Saving the Model (and Scaler)

Preparing the Deployment of the Model through a Module

Creating the Targets for the Logistic Regression

Selecting the Inputs for the Logistic Regression

Standardizing the Data

Splitting the Data for Training and Testing

Fitting the Model and Assessing its Accuracy

Creating a Summary Table with the Coefficients and Intercept

Interpreting the Coefficients for Our Problem

Standardizing only the Numerical Variables (Creating a Custom Scaler)

**Case Study – Loading the ‘absenteeism module’**

Are You Sure You’re All Set

Deploying the ‘absenteeism module’ – Part I

Deploying the ‘absenteeism module’ – Part II

Exporting the Obtained Data Set as a .csv

**Case Study – Analyzing the Predicted Outputs in Tableau**

EXERCISE – Age vs Probability

Analyzing Age vs Probability in Tableau

EXERCISE – Reasons vs Probability

Analyzing Reasons vs Probability in Tableau

EXERCISE – Transportation Expense vs Probability

Analyzing Transportation Expense vs Probability in Tableau

**Bonus Lecture**

Bonus Lecture Next Steps
