Introduction to Spark SQL in Python
Mark Plutowski
Data Scientist
df.cache()
df.unpersist()
df.is_cached
False
df.cache()
df.is_cached
True
df.unpersist()
df.is_cached
False
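The cache and unpersist calls above always come in pairs, so it can help to wrap them so that the release step is not forgotten. A minimal sketch, assuming a hypothetical helper named `cached` (it is not part of PySpark) that relies only on the `cache()` and `unpersist()` methods shown above:

```python
from contextlib import contextmanager


@contextmanager
def cached(df):
    """Hypothetical helper: keep `df` cached inside a with-block.

    `df` is assumed to be a Spark DataFrame; only its cache() and
    unpersist() methods are used.
    """
    df.cache()  # mark the DataFrame for caching
    try:
        yield df
    finally:
        df.unpersist()  # release the cache even if the block raises
```

With a real DataFrame this might be used as `with cached(df) as d: d.count()`, after which `df.is_cached` should be False again.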
df.unpersist()
df.cache()
df.storageLevel
StorageLevel(True, True, False, True, 1)
In the storage level above, the following hold:
useDisk = True
useMemory = True
useOffHeap = False
deserialized = True
replication = 1
The following are equivalent in Spark 2.1+:
df.persist()
df.persist(storageLevel=pyspark.StorageLevel.MEMORY_AND_DISK)
df.cache() is equivalent to df.persist()
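The positional form `StorageLevel(True, True, False, True, 1)` is hard to read at a glance. As a sketch, the hypothetical function `describe_storage_level` below (an illustration, not a PySpark API) turns a storage level into a labeled dictionary. It assumes the object exposes the attributes `useDisk`, `useMemory`, `useOffHeap`, `deserialized`, and `replication`, as `pyspark.StorageLevel` does:

```python
def describe_storage_level(level):
    """Hypothetical helper: return the five storage-level flags by name.

    `level` is assumed to be a pyspark.StorageLevel, for example the
    value of df.storageLevel.
    """
    return {
        "useDisk": level.useDisk,
        "useMemory": level.useMemory,
        "useOffHeap": level.useOffHeap,
        "deserialized": level.deserialized,
        "replication": level.replication,
    }
```

Calling `describe_storage_level(df.storageLevel)` on a cached DataFrame would list each flag next to its name instead of relying on argument order.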
df.createOrReplaceTempView('df')
spark.catalog.isCached(tableName='df')
False
spark.catalog.cacheTable('df')
spark.catalog.isCached(tableName='df')
True
spark.catalog.uncacheTable('df')
spark.catalog.isCached(tableName='df')
False
spark.catalog.clearCache()
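The catalog calls above can be combined to cache a table only when it is not cached already. A minimal sketch, assuming a hypothetical helper named `ensure_cached` (not part of PySpark) and a `spark` session whose `catalog` offers `isCached` and `cacheTable` as used above:

```python
def ensure_cached(spark, table_name):
    """Hypothetical helper: cache a registered table or view if needed.

    Returns True if this call cached the table, and False if the
    catalog reported it as already cached.
    """
    if spark.catalog.isCached(tableName=table_name):
        return False  # nothing to do
    spark.catalog.cacheTable(table_name)
    return True
```

For the temporary view registered earlier, `ensure_cached(spark, 'df')` would be expected to return True on the first call and False on the second.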