Preface

Chapter 1: Big Data Analytics with Java
  Why data analytics on big data?
  Big data for analytics
  Big data - a bigger pay package for Java developers
  Basics of Hadoop - a Java sub-project
  Distributed computing on Hadoop
  HDFS concepts
  Design and architecture of HDFS
  Main components of HDFS
  HDFS simple commands
  Apache Spark
  Concepts
  Transformations
  Actions
  Spark Java API
  Spark samples using Java 8
  Loading data
  Data operations - cleansing and munging
  Analyzing data - count, projection, grouping, aggregation, and max/min
  Actions on RDDs
  Paired RDDs
  Saving data
  Collecting and printing results
  Executing Spark programs on Hadoop
  Apache Spark sub-projects
  Spark machine learning modules
  Mahout - a popular Java ML library
  Deeplearning4j - a deep learning library
  Summary

Chapter 2: First Steps in Data Analysis
  Datasets
  Data cleaning and munging
  Basic analysis of data with Spark SQL
  Building SparkConf and context
  Dataframe and datasets
  Load and parse data
  Analyzing data - the Spark-SQL way
  Spark SQL for data exploration and analytics
  Market basket analysis - Apriori algorithm
  Implementation of the Apriori algorithm in Apache Spark
  Efficient market basket analysis using FP-Growth algorithm
  Running FP-Growth on Apache Spark
  Summary

Chapter 3: Data Visualization
  Data visualization with Java JFreeChart
  Using charts in big data analytics
  Time Series chart
  All India seasonal and annual average temperature series dataset
  Simple single Time Series chart