Architecting Data Warehousing Solutions Using Google BigQuery

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 2h 48m | 378 MB

BigQuery is Google Cloud Platform’s cloud data warehouse. In this course, you’ll learn how to work with huge datasets in BigQuery with little to no administrative overhead.

Organizations store massive amounts of data collated from a wide variety of sources. BigQuery supports fast querying at petabyte scale, with serverless functionality and autoscaling. BigQuery also supports streaming data, works with visualization tools, and interacts seamlessly with Python scripts running from Datalab notebooks. In this course, Architecting Data Warehousing Solutions Using Google BigQuery, you’ll learn how to work with huge datasets in BigQuery with little to no administrative overhead related to cluster and node provisioning.

First, you’ll start off with an overview of the suite of storage products on the Google Cloud and the unique position that BigQuery holds. You’ll see how BigQuery compares with Cloud SQL, Bigtable, and Datastore on the GCP and how it differs from Amazon Redshift, the data warehouse on AWS. Next, you’ll create datasets in BigQuery, which are the equivalent of databases in RDBMSes, and create tables within those datasets, where the actual data is stored. You’ll work with BigQuery using the web console as well as the command line. You’ll load data into BigQuery tables from the CSV, JSON, and Avro formats and see how you can execute and manage jobs.

Finally, you’ll wrap up by exploring advanced analytical queries that use nested and repeated fields. You’ll run aggregate operations on your data and use advanced windowing functions as well. You’ll programmatically access BigQuery using client libraries in Python and visualize your data using Data Studio. At the end of this course, you’ll be comfortable working with huge datasets stored in BigQuery, executing analytical queries, performing analysis, and building charts and graphs for your reports.

Table of Contents

Course Overview
1 Course Overview

Understanding BigQuery in the GCP Service Taxonomy
2 Module Overview
3 Prerequisites and Course Outline
4 Transactional and Analytical Processing
5 Introducing BigQuery
6 Choosing BigQuery
7 BigQuery Pricing
8 Logging into the GCP and Enabling APIs
9 Beta UI and Classic UI
10 Public Datasets
11 Executing Queries
12 Working with the bq Command on the Terminal

Using Datasets, Tables, and Views in BigQuery
13 Module Overview
14 Datasets, Tables, and Views
15 Creating and Editing Access to a Dataset
16 Verifying Access to Datasets
17 Creating and Querying Tables
18 Creating Tables from Other Source Tables
19 Creating Views
20 Authorized Views

Getting Data in and out of BigQuery
21 Module Overview
22 Uploading Data to GCS Buckets
23 Importing Data from CSV Files on Cloud Storage
24 Configuring Additional Settings While Importing JSON Files
25 Creating External Tables with Data in GCS Buckets
26 Partitioning in BigQuery
27 Creating and Querying Ingestion-Time Partitioned Tables
28 Creating Column-Based Partitioned Tables Using the Command Line

Performing Advanced Analytical Queries in BigQuery
29 Module Overview
30 Normalized Storage in a Traditional Database
31 Denormalized Storage: Nested and Repeated Fields
32 The UNNEST, ARRAY_AGG, and STRUCT Operators
33 Working with Nested Fields
34 Populating Data into a Table with Nested Fields Using STRUCT
35 Working with Repeated Fields
36 Populating Tables with Repeated Fields Using ARRAY_AGG
37 Using Nested and Repeated Fields Together
38 Using UNNEST to Query Repeated Fields
39 Aggregations
40 Subqueries
41 Windowing Operations
42 Performing Window Operations Using Partition By and Order By
43 Windowing Operations Using a Window Range

Programmatically Accessing BigQuery from Client Programs
44 Module Overview
45 Integrating BigQuery with Data Studio
46 Connecting to Datalab
47 Running Queries Programmatically
48 Summary and Further Study