course
Data Engineering on Azure (DP-203)
Gain in-depth and hands-on expertise with the Azure Data Engineering toolbox
Description
In this course, students will learn about data engineering patterns and practices as they pertain to working with batch and real-time analytical solutions using Azure data platform technologies. Students will begin by understanding the core compute and storage technologies that are used to build an analytical solution. They will then explore how to design analytical serving layers and focus on data engineering considerations for working with source files.
Students will learn how to interactively explore data stored in files in a data lake through serverless SQL pools or Apache Spark notebooks. They will learn the various ingestion techniques that can be used to load data using the Apache Spark capability found in Azure Synapse Analytics or Azure Databricks, and how to ingest data using Azure Data Factory or Azure Synapse pipelines. During this process, students will use PolyBase, COPY, and other tools to keep the process performant with both big and small data. Students will also learn the various ways they can transform data using the same technologies that are used to ingest it.
They will understand the importance of implementing security to ensure that data is protected at rest and in transit. Students will then learn how to create a real-time analytical system.
This course also prepares students for exam DP-203: Data Engineering on Microsoft Azure, which leads to the Microsoft Certified: Azure Data Engineer Associate certification. An exam voucher is not included.
Prior Knowledge
Experience with data engineering and knowledge at the level of Azure Data Fundamentals.
Subjects
1: Get started with data engineering on Azure
In most organizations, a data engineer is the primary role responsible for integrating, transforming, and consolidating data from various structured and unstructured data systems into structures that are suitable for building analytics solutions. An Azure data engineer also helps ensure that data pipelines and data stores are high-performing, efficient, organized, and reliable, given a specific set of business requirements and constraints.
Lessons
- Introduction to data engineering on Azure
- Introduction to Azure Data Lake Storage Gen2
- Introduction to Azure Synapse Analytics
2: Analyze data with Azure Synapse Analytics serverless SQL pools
If you have large volumes of data stored as files in a data lake, you'll need a convenient way to explore and analyze the data they contain. Azure Synapse Analytics enables you to apply the SQL skills you use in a relational database to files in a data lake.
Lessons
- Use Azure Synapse serverless SQL pool to query files in a data lake
- Use Azure Synapse serverless SQL pool to transform data in a data lake
- Create a lake database in Azure Synapse Analytics
3: Perform data engineering with Azure Synapse Apache Spark Pools
Apache Spark is a highly scalable distributed processing solution for big data analytics and transformation. You can leverage its power in Azure Synapse Analytics by using Spark pools.
Lessons
- Analyze data with Apache Spark in Azure Synapse Analytics
- Transform data with Spark in Azure Synapse Analytics
- Use Delta Lake in Azure Synapse Analytics
4: Work with data warehouses using Azure Synapse Analytics
Relational data warehouses are at the heart of many business intelligence and enterprise analytics solutions. You can use Azure Synapse Analytics to implement highly scalable data warehouses in the cloud.
Lessons
- Analyze data in a relational data warehouse
- Load data into a relational data warehouse
5: Transfer and transform data with Azure Synapse Analytics pipelines
Azure Synapse Analytics enables data integration through the use of pipelines, which you can use to automate and orchestrate data transfer and transformation activities.
Lessons
- Build a data pipeline in Azure Synapse Analytics
- Use Spark Notebooks in an Azure Synapse Pipeline
6: Work with hybrid transactional and analytical processing (HTAP) solutions using Azure Synapse Analytics
Hybrid Transactional and Analytical Processing (HTAP) is a technique for near-real-time analytics without a complex ETL solution. In Azure Synapse Analytics, HTAP is supported through Azure Synapse Link.
Lessons
- Plan hybrid transactional and analytical processing using Azure Synapse Analytics
- Implement Azure Synapse Link with Azure Cosmos DB
- Implement Azure Synapse Link for SQL
7: Implement a data streaming solution with Azure Stream Analytics
Stream processing enables you to capture and analyze data in real time. Azure Stream Analytics is a cloud-based stream processing engine that you can use to build highly scalable real-time analytics solutions.
Lessons
- Get started with Azure Stream Analytics
- Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics
- Visualize real-time data with Azure Stream Analytics and Power BI
8: Govern data across an enterprise
Use Microsoft Purview to register and scan data, catalog data artifacts, find data for reporting, and manage Power BI artifacts to improve data governance in your organization.
Lessons
- Introduction to Microsoft Purview
- Integrate Microsoft Purview and Azure Synapse Analytics
9: Data engineering with Azure Databricks
Learn how to harness the power of Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.
Lessons
- Explore Azure Databricks
- Use Apache Spark in Azure Databricks
- Use Delta Lake in Azure Databricks
- Use SQL Warehouses in Azure Databricks
- Run Azure Databricks Notebooks with Azure Data Factory
Schedule
Dates | Duration | Location
---|---|---
February 24, 25, 26, and 27, 2025 | 4 days | Utrecht / Remote (hybrid training; can be followed remotely)
All courses can also be delivered within your organization as customized or in-company training.
Our training advisors are happy to give you personal advice or help you arrange in-company training for your organization.
"Trainer who knows his profession!"Marc
- Highly rated
- Practice-oriented training
- Certified trainers
- In-house instructors