Transforming and analyzing data with Microsoft Fabric
Luis Silva
Solution Architect - Data & AI


T-SQL provides aggregate functions such as SUM(), COUNT(), AVG(), MIN(), MAX(), STDEV(), and VAR(), used together with a GROUP BY clause. The general pattern is:

SELECT
    <unaggregated columns>,
    function(<aggregated column>)
FROM
    <table>
GROUP BY
    <unaggregated columns>;
SELECT
    [State],
    COUNT([Order_ID]) AS [Num Orders],
    SUM([Order_Amount]) AS [Total Amount]
FROM
    [tbl_Orders]
GROUP BY
    [State];
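As a runnable sketch of the same GROUP BY pattern, the query can be tried against an in-memory SQLite table (the sample rows below are made up for illustration, not Fabric data):

```python
import sqlite3

# In-memory database with a hypothetical tbl_Orders table
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Orders (Order_ID INTEGER, State TEXT, Order_Amount REAL)"
)
conn.executemany(
    "INSERT INTO tbl_Orders VALUES (?, ?, ?)",
    [(1, "WA", 100.0), (2, "WA", 50.0), (3, "CA", 75.0)],
)

# Same shape as the T-SQL example: one row per State with a count and a sum
rows = conn.execute(
    """
    SELECT State, COUNT(Order_ID) AS Num_Orders, SUM(Order_Amount) AS Total_Amount
    FROM tbl_Orders
    GROUP BY State
    ORDER BY State
    """
).fetchall()
print(rows)  # [('CA', 1, 75.0), ('WA', 2, 150.0)]
```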

PySpark offers equivalent aggregate functions: sum(), count(), avg(), min() and max(), first() and last(), stddev(), and variance(), used together with groupBy() and agg(). The general pattern is:

df.groupBy(<unaggregated columns>)
  .agg(function(<aggregated column>))

from pyspark.sql.functions import count, sum
df.groupBy("state").agg(count("order_id"), sum("order_amount")).show()
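Running that snippet requires a Spark session, so here is a plain-Python sketch of what the groupBy/agg call computes, over a small hypothetical list of order records:

```python
# Plain-Python equivalent of df.groupBy("state").agg(count(...), sum(...)),
# using a made-up dataset instead of a Spark DataFrame
orders = [
    {"state": "WA", "order_id": 1, "order_amount": 100.0},
    {"state": "WA", "order_id": 2, "order_amount": 50.0},
    {"state": "CA", "order_id": 3, "order_amount": 75.0},
]

# Accumulate (count, sum) per state, as the aggregation would
totals = {}
for row in orders:
    cnt, amt = totals.get(row["state"], (0, 0.0))
    totals[row["state"]] = (cnt + 1, amt + row["order_amount"])

print(totals)  # {'WA': (2, 150.0), 'CA': (1, 75.0)}
```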
Import the functions you need from pyspark.sql.functions with a statement at the beginning of your code.

# ----- Import one or more specific functions:
from pyspark.sql.functions import sum, avg, count, min, max
# ----- Import all SQL functions:
from pyspark.sql.functions import *
# ----- Import all SQL functions under an alias:
import pyspark.sql.functions as F
# call sum as: F.sum()
Common summary statistics: sum, average, median, min, max, percentile, row count
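These summary statistics can all be computed in plain Python with the standard statistics module (the sample amounts below are made up for illustration):

```python
import statistics

amounts = [100.0, 50.0, 75.0, 120.0, 95.0]  # hypothetical order amounts

print(sum(amounts))               # sum -> 440.0
print(statistics.mean(amounts))   # average -> 88.0
print(statistics.median(amounts)) # median -> 95.0
print(min(amounts), max(amounts)) # min / max -> 50.0 120.0
# quantiles(n=4) returns the quartiles; index [2] is the 75th percentile
print(statistics.quantiles(amounts, n=4)[2])
print(len(amounts))               # row count -> 5
```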

