Implementing a Data Analytics Solution with Azure Databricks
Get started with data engineering on Azure Databricks
Description
Learn to harness the power of Apache Spark and powerful clusters on the Azure Databricks platform to run large data engineering workloads in the cloud.
Prerequisites
Knowledge of data engineering and Microsoft Azure
Topics
Module 1: Explore Azure Databricks
Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.
After completing this module, you will be able to:
- Provision an Azure Databricks workspace.
- Identify core workloads and personas for Azure Databricks.
- Describe key concepts of an Azure Databricks solution.
Module 2: Use Apache Spark in Azure Databricks
Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to transform, analyze and visualize data at scale.
After completing this module, you will be able to:
- Describe key elements of the Apache Spark architecture.
- Create and configure a Spark cluster.
- Describe use cases for Spark.
- Use Spark to process and analyze data stored in files.
- Use Spark to visualize data.
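As an illustration of processing data stored in files, here is a minimal PySpark sketch. It assumes a Spark session is passed in (in a Databricks notebook one is predefined as `spark`) and that the files are CSV with a header row. The function name, the file path, and the `Category` column are hypothetical and not part of the course material.

```python
def summarize_by_category(spark, path):
    """Read CSV files into a DataFrame and count rows per category.

    Assumptions (hypothetical): the files have a header row and
    contain a column named "Category".
    """
    df = (
        spark.read
        .option("header", "true")       # first line holds column names
        .option("inferSchema", "true")  # let Spark guess column types
        .csv(path)
    )
    # Aggregate: number of rows per category, largest group first.
    return df.groupBy("Category").count().orderBy("count", ascending=False)
```

In a notebook, calling `summarize_by_category(spark, "/data/products/*.csv").show()` would print the aggregated rows, assuming files exist at that hypothetical path.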
Module 3: Use Delta Lake in Azure Databricks
Delta Lake is an open-source storage layer that adds relational table capabilities to Spark, which you can use to implement a data lakehouse architecture in Azure Databricks.
After completing this module, you will be able to:
- Describe core features and capabilities of Delta Lake.
- Create and use Delta Lake tables in Azure Databricks.
- Create Spark catalog tables for Delta Lake data.
- Use Delta Lake tables for streaming data.
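The objectives above could look roughly like this in PySpark. This is a sketch under stated assumptions: `spark` is an active Spark session with Delta Lake available (as on Azure Databricks), `df` is an existing DataFrame, and the helper names, table name, and paths are hypothetical.

```python
def write_delta(df, path):
    """Save a DataFrame as Delta files, replacing any existing data."""
    df.write.format("delta").mode("overwrite").save(path)


def register_table(spark, table_name, path):
    """Create a catalog table over Delta files already stored at `path`."""
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"USING DELTA LOCATION '{path}'"
    )


def stream_from_delta(spark, path):
    """Open the Delta files at `path` as a streaming source."""
    return spark.readStream.format("delta").load(path)
```

Separating the three steps keeps each capability (batch write, catalog registration, streaming read) visible on its own. A real pipeline would also start the stream with a sink and a checkpoint location.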
Module 4: Use SQL Warehouses in Azure Databricks
Azure Databricks provides SQL Warehouses that enable data analysts to work with data using familiar relational SQL queries.
After completing this module, you will be able to:
- Create and configure SQL Warehouses in Azure Databricks.
- Create databases and tables.
- Create queries and dashboards.
Module 5: Run Azure Databricks Notebooks with Azure Data Factory
Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.
After completing this module, you will be able to:
- Describe how Azure Databricks notebooks can be run in a pipeline.
- Create an Azure Data Factory linked service for Azure Databricks.
- Use a Notebook activity in a pipeline.
- Pass parameters to a notebook.
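To make the last two objectives concrete, the sketch below builds the JSON definition of a Databricks Notebook activity for an Azure Data Factory pipeline, including parameters passed to the notebook. The activity name, notebook path, linked service name, and the `folder` parameter are hypothetical examples. The helper function itself is only an illustration, not part of any SDK.

```python
import json


def notebook_activity(name, notebook_path, linked_service, parameters):
    """Build a pipeline activity definition that runs a Databricks notebook."""
    return {
        "name": name,
        "type": "DatabricksNotebook",
        # Reference to the linked service that connects to the workspace.
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # Key/value pairs handed to the notebook at run time.
            "baseParameters": dict(parameters),
        },
    }


# Hypothetical values for illustration only.
activity = notebook_activity(
    name="ProcessData",
    notebook_path="/Shared/process-data",
    linked_service="AzureDatabricksLinkedService",
    parameters={"folder": "product_data"},
)
print(json.dumps(activity, indent=2))
```

Inside the notebook, a value passed this way is typically read through a widget, for example `dbutils.widgets.get("folder")`.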
Need in-company training or personal advice?
Our training advisors are happy to give you personal advice or to arrange an in-company training within your organization.
Prerequisite trainings
Microsoft Azure Data Fundamentals (DP-900)
Learn the basics of cloud data solutions within Microsoft Azure
- Cloud
Introduction to Data Analytics and Business Intelligence
Learn the principles of Data Analytics and Business Intelligence in one day
- Business Intelligence
Essentials of Python Development
Build a solid foundation for developing software in Python
- Python
"Very pleasant instructor, who gave the course excellent substance in their own way. It was a pleasure to follow the course this way." Marieke