Stream Processing Design Patterns with Spark

English | MP4 | AVC 1280×720 | AAC 48 kHz 2ch | 1h 09m | 219 MB

Stream processing is growing in popularity as websites, devices, and communications generate ever more data. Apache Spark is a leading platform for fast, scalable stream processing, but it still requires careful design to run efficiently. This course helps developers apply best practices and validated design patterns to implement stream processing in Apache Spark. Instructor Kumaran Ponnambalam shows how to set up your environment, then walks through four design patterns with real-world use cases: streaming analytics, alerts and thresholds, leaderboards, and real-time predictions. In chapter six, he introduces a start-to-finish project that shows how to go from design to an executed job using Spark, Apache Kafka, MariaDB, and Redis. By the end of the course, you’ll understand the platform’s stream processing capabilities and be able to incorporate them into your own data engineering solutions.

Topics include:

  • Streaming opportunities and challenges
  • Setting up the environment
  • Streaming analytics with Spark
  • Monitoring alerts and thresholds with Spark
  • Creating leaderboards with Spark
  • Generating real-time predictions with Spark
  • Hands-on Spark streaming project

Table of Contents

Introduction
1 Streaming with Spark
2 Prerequisites

Stream Processing with Spark
3 What is stream processing?
4 Streaming opportunities and challenges
5 Streaming with Apache Spark
6 Spark Structured Streaming APIs and SQL
7 Setting up the exercise files
8 Setting up Kafka
9 Setting up MariaDB and Redis

Streaming Analytics
10 Streaming analytics: Pattern
11 Streaming analytics: Use case design
12 Streaming analytics: Helper classes
13 Streaming analytics: Pipeline implementation
14 Streaming analytics: Results review

Alerts and Thresholds
15 Alerts and thresholds: Pattern
16 Alerts and thresholds: Use case design
17 Alerts and thresholds: Helper classes
18 Alerts and thresholds: Pipeline implementation
19 Alerts and thresholds: Review

Leaderboards
20 Leaderboards: Pattern
21 Leaderboards: Use case design
22 Leaderboards: Helper classes
23 Leaderboards: Pipeline implementation
24 Leaderboards: Review

Real-Time Predictions
25 Real-time predictions: Pattern
26 Real-time predictions: Use case design
27 Real-time predictions: Helper classes
28 Real-time predictions: Pipeline implementation
29 Real-time predictions: Review

Use Cases
30 Use case definition
31 Design of the project
32 Code walk-through
33 Execute and analyze

Conclusion
34 Next steps