Merging and Joining Data

Transform and Analyze Data with Microsoft Fabric

Luis Silva

Solution Architect - Data & AI

Tools for data merging

 

 

Icons representing three tools: SQL, Spark and Dataflows

Transform and Analyze Data with Microsoft Fabric

Basics of data merging

  • Join tables based on common columns.
  • Useful when building star schemas.
  • Core operation: JOIN between two tables (left and right).
Transform and Analyze Data with Microsoft Fabric

Basics of data merging

  • Join tables based on common columns.
  • Useful when building star schemas.
  • Core operation: JOIN between two tables (left and right).

Diagram illustrating the merge of two tables

Transform and Analyze Data with Microsoft Fabric

Types of Join

  • INNER JOIN: Returns records that have matching values in both tables.

Diagram representing two tables, left and right, highlighting the intersection as the result of the inner join operation

Transform and Analyze Data with Microsoft Fabric

Types of Join

  • INNER JOIN: Returns records that have matching values in both tables.
  • LEFT OUTER JOIN: Returns all records from the left table, and the matched records from the right table.

Diagram representing two tables, left and right, highlighting the left table and the intersection as the result of the left outer join operation

Transform and Analyze Data with Microsoft Fabric

Types of Join

  • INNER JOIN: Returns records that have matching values in both tables.
  • LEFT OUTER JOIN: Returns all records from the left table, and the matched records from the right table.
  • RIGHT OUTER JOIN: Returns all records from the right table, and the matched records from the left table.

Diagram representing two tables, left and right, highlighting the right table and the intersection as the result of the right outer join operation

Transform and Analyze Data with Microsoft Fabric

Types of Join

  • INNER JOIN: Returns records that have matching values in both tables.
  • LEFT OUTER JOIN: Returns all records from the left table, and the matched records from the right table.
  • RIGHT OUTER JOIN: Returns all records from the right table, and the matched records from the left table.
  • FULL OUTER JOIN: Returns all records when there is a match in either left or right table.

Diagram representing two tables, left and right, highlighting the union as the result of the full outer join operation

Transform and Analyze Data with Microsoft Fabric

Merging tables with SQL

  • JOIN statement
  • Syntax:
    SELECT 
      <columns> 
      ... 
    FROM <Left_Table>
    INNER | LEFT | RIGHT | FULL JOIN <Right_Table> 
      ON <Left_Table>.<Join_Column> = <Right_Table>.<Join_Column>;
    
Transform and Analyze Data with Microsoft Fabric

Merging tables with PySpark

  • join() function
  • Syntax:
    <df_left>.join(
      <df_right,
      [<df_left>.<join_column> == <df_right>.<join_column>],
      how='inner' | 'left' | 'right | 'outer'
    )
    
Transform and Analyze Data with Microsoft Fabric

Merging tables with Dataflows

  • Merge queries
  • Merge queries as new

Screenshot showing the Merqe queries dialog in a Dataflow

Transform and Analyze Data with Microsoft Fabric

Let's practice!

Transform and Analyze Data with Microsoft Fabric

Preparing Video For Download...