Metadata
- Authors: Akash Tandon; Sandy Ryza; Uri Laserson; Sean Owen; Josh Wills
- Full Title:: Advanced Analytics with PySpark
- Category:: 🗞️Articles
- Document Tags:: Spark
- URL:: https://ipfs.io/ipfs/bafykbzacebl66lstaydak4muvwoui5e2ofwayncuqvmsg6fip6dbrs7igupoo?filename=advanced-analytics-with-pyspark-patterns-for-from—annas-archive—libgenrs-nf-3345607.pdf
- Finished date:: 2023-04-26
Highlights
So people using it interactively would need to remember this
After the data has been parsed once, we’d like to save the data in its parsed form on the cluster so that we don’t have to reparse it every time we want to ask a new question. Spark supports this use case by allowing us to signal that a given DataFrame should be cached in memory after it is generated by calling the cache method on the instance. Let’s do that now for the parsed DataFrame: parsed.cache()