Preface

Part I. Gentle Overview of Big Data and Spark
  1. What Is Apache Spark?
    Apache Spark's Philosophy
    Context: The Big Data Problem
    History of Spark
    The Present and Future of Spark
    Running Spark
    Downloading Spark Locally
    Launching Spark's Interactive Consoles
    Running Spark in the Cloud
    Data Used in This Book
  2. A Gentle Introduction to Spark
    Spark's Basic Architecture
    Spark Applications
    Spark's Language APIs
    Spark's APIs
    Starting Spark
    The SparkSession
    DataFrames
    Partitions
    Transformations
    Lazy Evaluation
    Actions
    Spark UI
    An End-to-End Example
    DataFrames and SQL
    Conclusion
  3. A Tour of Spark's Toolset
    Running Production Applications
    Datasets: Type-Safe Structured APIs
    Structured Streaming
    Machine Learning and Advanced Analytics
    Lower-Level APIs
    SparkR
    Spark's Ecosystem and Packages
    Conclusion

Part II. Structured APIs--DataFrames, SQL, and Datasets
  4. Structured API Overview
    DataFrames and Datasets
    Schemas
    Overview of Structured Spark Types
    DataFrames Versus Datasets
    Columns
    Rows
    Spark Types
    Overview of Structured API Execution
    Logical Planning
    Physical Planning
    Execution