This project is built using PySpark and Spark SQL on the Azure Databricks platform.
- Ingested data in several forms (CSV, JSON, and multi-file datasets) using PySpark.
- Transformed the data (filter, join, aggregation, column rename) using PySpark.
- Loaded the transformed data in Parquet file format and created a managed table for analysis.
- Analyzed the data using Spark SQL and created a presentation layer.
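
The steps listed here could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the paths, table name, and column names (`customerId`, `status`, `amount`) are hypothetical, and `spark` is assumed to be the `SparkSession` that a Databricks notebook provides.

```python
import re


def to_snake_case(name):
    """Convert a camelCase column name to snake_case (pure Python helper)."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def ingest(spark, csv_path, json_path, multi_file_dir):
    """Read a CSV file, a JSON file, and a folder of CSV files."""
    reader = spark.read.option("header", True).option("inferSchema", True)
    orders = reader.csv(csv_path)
    customers = spark.read.json(json_path)
    # Pointing the reader at a directory loads every file inside it.
    order_items = reader.csv(multi_file_dir)
    return orders, customers, order_items


def rename_columns(df):
    """Column rename: apply snake_case to every column."""
    for col in df.columns:
        df = df.withColumnRenamed(col, to_snake_case(col))
    return df


def transform(orders, customers):
    """Filter, join, and aggregate (hypothetical columns)."""
    orders = rename_columns(orders)
    customers = rename_columns(customers)
    completed = orders.filter("status = 'COMPLETED'")              # filter
    joined = completed.join(customers, on="customer_id", how="inner")  # join
    return (
        joined.groupBy("customer_id")                              # aggregation
        .agg({"amount": "sum"})
        .withColumnRenamed("sum(amount)", "total_amount")
    )


def load(df, table_name):
    """Write Parquet data as a managed table (no explicit path is given)."""
    # Depending on the workspace's metastore configuration, managed tables
    # may need a different format such as Delta.
    df.write.mode("overwrite").format("parquet").saveAsTable(table_name)


def analyze(spark, table_name):
    """Query the managed table with Spark SQL."""
    return spark.sql(
        f"SELECT customer_id, total_amount FROM {table_name} "
        "ORDER BY total_amount DESC LIMIT 10"
    )
```

In a notebook, these would be chained as `ingest`, then `transform`, then `load`, then `analyze`, passing the notebook's `spark` object. Keeping each stage in its own function makes it easier to rerun a single step while developing.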