Learning Path

Professional Data Engineer Exam Guide

Master every domain of the GCP Professional Data Engineer certification, from designing data processing systems to building pipelines, choosing storage, preparing data for analytics and ML, and automating production workloads. The path includes five comprehensive study guides with hands-on notebooks.

01

Designing Data Processing Systems

Covers roughly 22% of the exam. Design for security and compliance, reliability and fidelity, and flexibility and portability, and plan data migrations. Master IAM, encryption, Dataflow, Dataform, Cloud Data Fusion, and migration services.

11 min read | Notebook
Security & Compliance | Reliability & Fidelity | Flexibility & Portability | Data Migrations
02

Ingesting and Processing Data

Covers roughly 25% of the exam, the largest section. Plan and build data pipelines; understand batch vs. streaming processing, windowing, and late data handling; and deploy production pipelines with Cloud Composer and CI/CD.

10 min read | Notebook
Pipeline Planning | Dataflow & Beam | Batch vs Streaming | Windowing & Late Data | Cloud Composer
03

Storing the Data

Covers roughly 20% of the exam. Select the right storage system for each workload, design data warehouses and data lakes, understand BigQuery optimization, and build a unified data platform.

10 min read | Notebook
Storage Selection | Data Warehouse Design | Data Lake Architecture | BigQuery Optimization
04

Preparing Data for Analysis and ML

Covers roughly 15% of the exam. Visualize data with BI Engine, prepare data for AI/ML with BigQuery ML, work with embeddings and RAG patterns, share data with Analytics Hub, and protect sensitive data with Cloud DLP and data masking.

8 min read | Notebook
BI Engine & Visualization | BigQuery ML | Analytics Hub | DLP & Data Masking
05

Maintaining and Automating Workloads

Covers roughly 18% of the exam. Optimize resources and costs, automate with Cloud Composer DAGs, manage BigQuery workloads with Editions and reservations, monitor and troubleshoot pipelines, and design for fault tolerance and data integrity.

10 min read | Notebook
Resource Optimization | Automation & DAGs | Monitoring | Fault Tolerance