@Juanita: Hi, sure. The Spark docs have a good set of examples. Click on the Python tab to see them in Python. The first example shows a good basic use case; the third shows partitioning and bucketing used together.
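For reference, combining partitioning and bucketing looks roughly like the sketch below. It assumes `df` is an existing PySpark DataFrame; the function name, column names, bucket count and table name are placeholders for illustration, not taken from your data.

```python
def write_partitioned_bucketed(df, table_name, partition_col, bucket_col, num_buckets):
    """Write `df` as a table that is partitioned and bucketed at the same time.

    `df` is assumed to be a pyspark.sql.DataFrame. All other arguments are
    hypothetical values chosen by the caller.
    """
    (
        df.write
        .partitionBy(partition_col)           # one directory per distinct value
        .bucketBy(num_buckets, bucket_col)    # hash rows into a fixed number of buckets
        .saveAsTable(table_name)              # bucketing requires a persistent table
    )
```

You would call it as `write_partitioned_bucketed(df, "users_partitioned_bucketed", "favorite_color", "name", 42)`. Note that `bucketBy` only applies when saving with `saveAsTable`, not with a plain path-based `save`.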
@jamiet: Hi Jamiet, .persist(), .cache() and "CACHE TABLE foo" are different ways to invoke the same native Spark caching mechanism. Native caching can be effective, especially in ETL pipelines where you need to cache intermediate results. Keep in mind, though, that native caching doesn't work well with partitioned tables. A more generic and reliable caching technique is therefore storage-layer caching.
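To make the three entry points concrete, here is a minimal sketch. It assumes `spark` is a SparkSession and `df` is a DataFrame you already have; the helper name `cache_intermediate` and the view name `foo` are made up for illustration.

```python
def cache_intermediate(spark, df, method="cache", view_name="foo"):
    """Cache `df` through one of the three native entry points.

    `spark` is assumed to be a SparkSession and `df` a DataFrame.
    `view_name` is a hypothetical temp view name used only by the SQL route.
    """
    if method == "cache":
        # Lazy: data is materialized on the first action that touches it.
        return df.cache()
    if method == "persist":
        # Same mechanism; persist() also accepts an explicit StorageLevel.
        return df.persist()
    if method == "sql":
        # CACHE TABLE works on a named table or view and is eager
        # unless written as CACHE LAZY TABLE.
        df.createOrReplaceTempView(view_name)
        spark.sql(f"CACHE TABLE {view_name}")
        return spark.table(view_name)
    raise ValueError(f"unknown method: {method!r}")
```

In an ETL job you would typically cache an intermediate result that feeds several downstream steps, then call `unpersist()` (or `UNCACHE TABLE foo`) once those steps are done.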