Optimizing Dataflow Performance

Data Ingestion and Semantic Models with Microsoft Fabric

Alex Kuntz

Head of Cloud Curriculum, DataCamp

Staging in Dataflows Gen2

Temporarily holds data during transformation to optimize performance

  • Staging Artifacts:
    • Hidden internal Lakehouse storage used during data transformations
    • Automatically managed by Dataflows; not for direct user access
  • When to Use Staging:
    • Enabled by default for improved SQL endpoint performance
    • Disabled when loading directly to a Lakehouse or other non-Warehouse destination (can be re-enabled)
  • Removing Staging Data:
    • Disable staging and refresh the dataflow (staged data is cleared after 30 days)
    • Delete the dataflow or workspace to remove immediately

Accelerating Data Ingestion with Fast Copy

A high-speed data ingestion feature that scales to handle large datasets efficiently

  • Architecture: Redistributes heavy workloads from Power Query to a high-performance pipeline for faster processing
  • Benefit: Minimizes processing time by leveraging scalable backend resources for large data

Dataflows Gen2 Fast Copy Architecture


Optimizing Fast Copy: Prerequisites and Key Settings

Prerequisites:

  • Files: 100 MB+ (CSV/Parquet)
  • Databases: 5M+ rows (Azure SQL DB, PostgreSQL)
  • Supported Connectors: ADLS Gen2, Blob Storage, SQL DB, Lakehouse, PostgreSQL, On-premises SQL Server, Warehouse, Oracle
  • Supported Transformations: Combine files, Select columns, Change data types, Rename/Remove columns
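
The prerequisites above can be read as a simple checklist. The following is a minimal sketch of that checklist in Python, not Fabric's actual eligibility logic: the names `QueryProfile` and `fast_copy_blockers` are hypothetical, the thresholds and lists are copied from the slide, and 1 MB is assumed to mean 1024 * 1024 bytes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Values copied from the slide; every name here is hypothetical.
MIN_FILE_BYTES = 100 * 1024 * 1024  # 100 MB, assuming 1 MB = 1024 * 1024 bytes
MIN_DB_ROWS = 5_000_000             # 5M rows
SUPPORTED_CONNECTORS = {
    "ADLS Gen2", "Blob Storage", "SQL DB", "Lakehouse", "PostgreSQL",
    "On-premises SQL Server", "Warehouse", "Oracle",
}
SUPPORTED_TRANSFORMS = {
    "Combine files", "Select columns", "Change data types",
    "Rename columns", "Remove columns",
}


@dataclass
class QueryProfile:
    """Illustrative description of one dataflow query."""
    connector: str
    transforms: List[str] = field(default_factory=list)
    file_bytes: Optional[int] = None  # set for file sources (CSV/Parquet)
    row_count: Optional[int] = None   # set for database sources


def fast_copy_blockers(q: QueryProfile) -> List[str]:
    """Return the reasons a query misses the slide's prerequisites."""
    blockers = []
    if q.connector not in SUPPORTED_CONNECTORS:
        blockers.append(f"unsupported connector: {q.connector}")
    for t in q.transforms:
        if t not in SUPPORTED_TRANSFORMS:
            blockers.append(f"unsupported transformation: {t}")
    if q.file_bytes is not None and q.file_bytes < MIN_FILE_BYTES:
        blockers.append("file smaller than 100 MB")
    if q.row_count is not None and q.row_count < MIN_DB_ROWS:
        blockers.append("fewer than 5M rows")
    return blockers
```

An empty list means the query meets every prerequisite listed on the slide; each string in a non-empty list names one unmet prerequisite.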

Require Fast Copy Option:

  • Forces the use of Fast Copy; the refresh fails immediately if the criteria are not met
  • Saves time by failing fast instead of waiting on a long run with slower processing
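
The fail-fast behavior of this option can be modeled in a few lines. This is only an illustrative sketch of the decision described above, not how Fabric implements it; `choose_engine` and `FastCopyRequiredError` are hypothetical names, and the fallback to a slower standard engine when the option is off is an assumption drawn from the slide's wording.

```python
class FastCopyRequiredError(RuntimeError):
    """Raised when Fast Copy is required but the query is not eligible."""


def choose_engine(eligible: bool, require_fast_copy: bool) -> str:
    """Pick a processing path for one query (hypothetical model)."""
    if eligible:
        return "fast_copy"
    if require_fast_copy:
        # Fail immediately rather than run a long, slow refresh.
        raise FastCopyRequiredError("Fast Copy criteria not met")
    # Assumed fallback: without the option, the slower path is used.
    return "standard"
```

With the option on, an ineligible query surfaces as an error at the start of the refresh, so the problem is visible right away instead of after a long run.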

Dataflow Gen2 Default Destination

  • Create stand-alone Dataflows for specific data destinations (Lakehouse, Warehouse, or KQL Database).

  • Preset data destination settings are applied automatically, speeding up development!

Preset Behaviors: The following defaults are applied automatically and cannot be changed

  • Lakehouse: Replace update method, Dynamic schema
  • Warehouse/KQL Database: Append update method, Fixed schema
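
The presets above amount to a fixed lookup table. The sketch below writes that table out in Python as a quick reference; `DestinationPreset`, `PRESETS`, and `preset_for` are hypothetical names, and the frozen dataclass is just one way to mirror the point that these defaults cannot be changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: mirrors "cannot be changed"
class DestinationPreset:
    update_method: str
    schema: str


# Values copied from the slide; the structure itself is illustrative.
PRESETS = {
    "Lakehouse": DestinationPreset(update_method="Replace", schema="Dynamic"),
    "Warehouse": DestinationPreset(update_method="Append", schema="Fixed"),
    "KQL Database": DestinationPreset(update_method="Append", schema="Fixed"),
}


def preset_for(destination: str) -> DestinationPreset:
    """Look up the preset applied to a default destination."""
    try:
        return PRESETS[destination]
    except KeyError:
        raise ValueError(f"no default-destination preset for {destination!r}") from None
```

For example, `preset_for("Lakehouse")` returns the Replace/Dynamic preset, while an unknown destination name raises `ValueError`.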

Default Data Destination in Dataflows Gen2


Let's practice!
