Data Skipping Sample for Scala
Tags: Decision Optimization
Modified: May 14, 2020

Learn how to improve SQL query performance with data skipping, a performance optimization technique. Summary metadata stored for each data object lets a query skip the objects whose data has no relevance to the analysis. All Spark native data formats are supported, including Parquet, ORC, CSV, JSON and Avro. This notebook runs on Spark and Scala 2.11.
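
The idea behind data skipping can be illustrated with a minimal sketch in plain Scala. It is not the notebook's code and does not use the data skipping library's API. It assumes each data object has min/max summary metadata for a single numeric column; the names `ObjectSummary`, `DataSkippingSketch` and `candidatesForGreaterThan` are hypothetical.

```scala
// Hypothetical summary metadata for one data object (for example, one file):
// the minimum and maximum value of a single numeric column in that object.
final case class ObjectSummary(path: String, minValue: Double, maxValue: Double)

object DataSkippingSketch {
  // For a predicate of the form `column > threshold`, an object can only
  // contain matching rows if its maximum value exceeds the threshold.
  // Every other object is skipped without being read.
  def candidatesForGreaterThan(
      summaries: Seq[ObjectSummary],
      threshold: Double): Seq[String] =
    summaries.filter(_.maxValue > threshold).map(_.path)
}
```

With summaries for three objects covering the ranges 0 to 10, 12 to 25 and 30 to 40, a query with the predicate `column > 20` would need to read only the second and third objects. The real work in a Spark setting is collecting and storing the metadata, which this sketch leaves out.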
