Data Engineering on Azure

English | 2021 | ISBN: 978-1617298929 | 336 Pages | EPUB | 10 MB

Build a data platform to the industry-leading standards set by Microsoft’s own infrastructure.

In Data Engineering on Azure you will learn how to:

  • Pick the right Azure services for different data scenarios
  • Manage data inventory
  • Implement production-quality data modeling, analytics, and machine learning workloads
  • Handle data governance
  • Use DevOps to increase reliability
  • Ingest, store, and distribute data
  • Apply best practices for compliance and access control

Data Engineering on Azure reveals the data management patterns and techniques that support Microsoft’s own massive data infrastructure. Author Vlad Riscutia, a data engineer at Microsoft, teaches you to bring engineering rigor to your data platform and ensure that your data prototypes function just as well under the pressures of production. You’ll implement common data modeling patterns, stand up cloud-native data platforms on Azure, and get to grips with DevOps for both analytics and machine learning.

Build secure, stable data platforms that can scale to loads of any size. When a project moves from the lab into production, you need confidence that it can stand up to real-world challenges. This book teaches you to design and implement cloud-based data infrastructure that you can easily monitor, scale, and modify.

In Data Engineering on Azure you’ll learn the skills you need to build and maintain big data platforms in massive enterprises. This invaluable guide includes clear, practical guidance for setting up infrastructure, orchestration, workloads, and governance. As you go, you’ll set up efficient machine learning pipelines, and then master time-saving automation and DevOps solutions. The Azure-based examples are easy to reproduce on other cloud platforms.

What’s inside

  • Data inventory and data governance
  • Assure data quality, compliance, and distribution
  • Build automated pipelines to increase reliability
  • Ingest, store, and distribute data
  • Production-quality data modeling, analytics, and machine learning