Prediction and evaluation

Introduction to Spark SQL in Python

Mark Plutowski

Data Scientist

Applying a model to evaluation data

predicted = df_trained.transform(df_test)

x = predicted.first()
print("Right!" if x.label == int(x.prediction) else "Wrong")
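The single-row check above extends naturally to every row: the fraction of rows where `label == int(prediction)` is the accuracy. The following is a minimal sketch in plain Python, not Spark code. The `Row` namedtuple and the four example rows are hypothetical stand-ins for the rows of `predicted`, used only to show the comparison.

```python
from collections import namedtuple

# Hypothetical stand-in for rows of `predicted`; only the two fields used here.
Row = namedtuple("Row", ["label", "prediction"])

# Made-up rows for illustration: prediction is a double, label is an int.
rows = [Row(1, 1.0), Row(0, 0.0), Row(1, 0.0), Row(0, 0.0)]

def accuracy(rows):
    """Fraction of rows whose integer-cast prediction equals the label."""
    correct = sum(1 for r in rows if r.label == int(r.prediction))
    return correct / len(rows)

acc = accuracy(rows)
print("Accuracy: %.2f" % acc)  # prints "Accuracy: 0.75"
```

The `int(...)` cast mirrors the slide's check, since the prediction is stored as a floating-point value while the label is compared as an integer.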

model_stats = model.evaluate(df_eval)

type(model_stats)

pyspark.ml.classification.BinaryLogisticRegressionSummary

print("\nPerformance: %.2f" % model_stats.areaUnderROC)
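To make the `areaUnderROC` figure easier to interpret: the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes that quantity directly in plain Python. It is an illustration of the metric, not how Spark computes it, and the function name `area_under_roc` and the sample labels and scores are made up for this example.

```python
def area_under_roc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Illustrative data only: 3 positives and 3 negatives with model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = area_under_roc(labels, scores)
print("Performance: %.2f" % auc)  # prints "Performance: 0.89"
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.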

Positive labels:
- ['her', 'him', 'he', 'she', 'them', 'us', 'they', 'himself', 'herself', 'we']
Number of examples: 5746
Number of examples: 2873 positive, 2873 negative
Training examples: 4607
Test examples: 1139
Training iterations: 21
Test AUC: 0.87
