Data Science on Google Cloud Platform: Exploratory Data Analytics

English | MP4 | AVC 1280×720 | AAC 48 kHz 2ch | 0h 57m | 134 MB

Cloud computing brings on-demand scalability and elasticity to data science applications. Expertise in the major platforms, such as Google Cloud Platform (GCP), is essential for IT professionals. This course, one of a series by cloud engineering specialist and data scientist Kumaran Ponnambalam, shows how to conduct exploratory data analytics with GCP. First, review the concepts of segmentation and profiling. Then get hands-on as you learn to perform both textual and visual analysis of data using tools provided by GCP: Cloud Datalab, BigQuery, Cloud Dataflow, and Data Studio. Finally, look at an end-to-end use case that applies what you’ve learned in the course.

Topics include:

  • Setting up Cloud Datalab for exploratory data analytics
  • Segmentation and profiling
  • Reading and writing data from BigQuery
  • Managing cloud storage buckets
  • Creating visualizations of BigQuery data with the GCP Charting API
  • Managing Datalab instances

Table of Contents

Introduction
1 Why EDA on Datalab
2 Data science modules covered

Exploration Options in GCP
3 BigQuery
4 Datalab
5 Data Studio
6 Cloud Dataflow

Cloud Datalab Basics
7 What is Datalab
8 Setting up the Cloud SDK
9 Setting up Datalab
10 Managing Datalab
11 Using the exercise files
12 Other capabilities

Datalab – BigQuery
13 Setting up BigQuery
14 BigQuery commands
15 Reading data from BigQuery
16 Working with DataFrames
17 Writing data to BigQuery

Datalab – Cloud Storage
18 Listing bucket contents
19 Managing buckets
20 Reading objects from a bucket
21 Writing to buckets

Datalab – Visualizations
22 Introduction to the charting API
23 Line charts with BigQuery data
24 Pie charts with BigQuery data
25 Time series analysis with Cloud Storage

EDA with GCP – Use Case
26 Loading data into a DataFrame
27 Cleansing and transforming data
28 Statistics and correlations
29 Segmentation and profiling
30 Writing results to Cloud Storage

Managing Datalab
31 Datalab instance management
32 Adding new packages
33 Managing source code
34 Datalab best practices

Conclusion
35 Next steps