Implement performance improvements

Transform and Analyze Data with Microsoft Fabric

Luis Silva

Solution Architect - Data & AI

Optimization strategies

  • Optimize design of Fabric items
  • Scale up
  • Scale out
Transform and Analyze Data with Microsoft Fabric

Optimizing SQL

  • Limit the queries to only the columns and operations needed

Example SQL SELECT statement reduced from selecting all columns to selecting only particular columns

Transform and Analyze Data with Microsoft Fabric

Optimizing SQL

  • Limit the queries to only the columns and operations needed
  • Use the smallest data types possible

Example CREATE TABLE statement where the data type of a numeric column is changed from BIGINT to INT

Transform and Analyze Data with Microsoft Fabric

Optimizing SQL

  • Limit the queries to only the columns and operations needed
  • Use the smallest data types possible
  • Use the star schema design

Diagram of a star schema with a fact table and related dimension tables

Transform and Analyze Data with Microsoft Fabric

Optimizing SQL

  • Limit the queries to only the columns and operations needed
  • Use the smallest data types possible
  • Use the star schema design
  • Use system views to monitor query usage
SELECT *
FROM sys.dm_exec_requests;
Transform and Analyze Data with Microsoft Fabric

Optimizing Notebooks

  • Leverage the Spark History Server

Screenshot of the Spark UI web page

Transform and Analyze Data with Microsoft Fabric

Optimizing Notebooks

  • Leverage the Spark History Server
  • Stop active Spark sessions when not in use
Transform and Analyze Data with Microsoft Fabric

Optimizing Notebooks

  • Leverage the Spark History Server
  • Stop active Spark sessions when not in use
  • When joining data, reduce the amount of data in memory prior to executing the join
Transform and Analyze Data with Microsoft Fabric

Optimizing dataflows

  • Minimize expensive operations like sorting

Sample dataflow where a sort operation is eliminated if it's not required

Transform and Analyze Data with Microsoft Fabric

Optimizing dataflows

  • Minimize expensive operations like sorting
  • Query folding
    • Offload processing tasks to the data source

Sample dataflow step. When no query folding occurs, the step evaluation occurs outside of the data source. When query folding occurs, the operation is evaluated by the data source

Transform and Analyze Data with Microsoft Fabric

Optimizing dataflows

  • Minimize expensive operations like sorting
  • Query folding
    • Offload processing tasks to the data source
  • Staging On/Off
    • Disable for small data loads or simpler transformations
    • Enable for larger data loads or more complex transformations

Screenshot showing the Enable Staging option in a dataflow query

Transform and Analyze Data with Microsoft Fabric

Scale Up

  • Increase the SKU size
  • Higher SKU = more compute capacity

Table listing the various Fabric Azure SKUs available, from the smallest F2 to the largest F2048 v

Transform and Analyze Data with Microsoft Fabric

Scale Up

Example illustration of scaling up a Fabric capacity by increasing the number of v-cores

Example: Scale up from F64 (8 v-cores) to F128 (16 v-cores)

Transform and Analyze Data with Microsoft Fabric

Scale Out

  • Move some items to a different Fabric capacity.
  • Isolate workloads
    • Production vs non-production
    • Different areas of the business, for example a separate capacity for executive reporting
Transform and Analyze Data with Microsoft Fabric

Scale Out

Example illustration of scaling out a Fabric capacity by adding a second capacity and moving a workload to it

Transform and Analyze Data with Microsoft Fabric

Let's practice!

Transform and Analyze Data with Microsoft Fabric

Preparing Video For Download...