
big-data-sql-specialization's Introduction

Modern Big Data Analysis with SQL

More information and details regarding this program can be accessed through this link.

Prerequisites

Participants are expected to have some familiarity with relational databases and Structured Query Language (SQL).

About

This three-course specialisation introduces querying big data with modern distributed SQL engines. Upon completion of the program, we can query huge datasets in clusters and cloud storage using a newer breed of technologies such as Hive, Impala, Presto and Drill.

This program is offered in collaboration with the following industry partner:

Estimated Duration: 4 Months
Program Structure: Self-paced (approx. 3 hrs/week)

Curriculum

1. Foundations for Big Data Analysis with SQL

In this introductory course, we get an overview of database systems and their common query language, SQL. We learn to distinguish between operational and analytical databases and understand key design principles before working with our data. Later on, we learn the features and benefits of different SQL dialects and explore databases in a big data system using a preconfigured virtual environment.
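The distinction between operational and analytical workloads can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module purely as a convenient stand-in, not one of the big data engines covered in the program, and the `accounts` table and its rows are hypothetical. An operational query typically reads or changes one row by key, while an analytical query scans many rows to summarise them.

```python
import sqlite3

# In-memory database with a hypothetical table, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, region TEXT, balance REAL);
    INSERT INTO accounts VALUES
        (1, 'north', 100.0),
        (2, 'north', 250.0),
        (3, 'south', 75.0);
""")

# Operational-style work: update and read a single row identified by its key.
conn.execute("UPDATE accounts SET balance = balance - 30.0 WHERE account_id = 1")
one_balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]

# Analytical-style work: scan the whole table and summarise it per region.
avg_by_region = conn.execute(
    "SELECT region, AVG(balance) FROM accounts GROUP BY region ORDER BY region"
).fetchall()

print(one_balance)
print(avg_by_region)
```

On a three-row table both queries are instant; the difference in access pattern only starts to matter at the data volumes that analytical systems are designed for.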

2. Analysing Big Data with SQL

In this course, we focus on big data SQL engines such as Apache Hive and Apache Impala. We learn how to explore and navigate databases using different tools and identify ways to group and aggregate data to answer analytic questions. Finally, we learn how to combine data from multiple tables and recognise the key differences between relational database management systems (RDBMSs) and modern query engines.
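As a minimal sketch of grouping, aggregating and combining tables, the example below joins two hypothetical tables (`customers` and `orders`, with invented rows) and totals the orders per country. It runs on Python's built-in `sqlite3` module so that it is self-contained. The query uses only standard `JOIN`, `GROUP BY`, `COUNT` and `SUM`, so a similar statement should work in Hive or Impala, though that is an assumption and is not verified here.

```python
import sqlite3

# In-memory database with hypothetical tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'US'), (2, 'US'), (3, 'DE');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 15.0), (13, 3, 40.0);
""")

# Combine the two tables on the shared key, then aggregate per country.
query = """
    SELECT c.country,
           COUNT(*)      AS num_orders,
           SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    GROUP BY c.country
    ORDER BY c.country
"""
rows = conn.execute(query).fetchall()

for country, num_orders, total_amount in rows:
    print(country, num_orders, total_amount)
```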

3. Managing Big Data in Clusters and Cloud Storage

In this course, we discover how to manage and load big datasets in distributed clusters and cloud storage. We learn how to choose among different data types and file formats, and how to address performance issues when working with data in big data systems. We end this course by learning how to optimise our queries and workloads in Apache Hive and Apache Impala.

Instructors

  • Glynn Durham - Senior Instructor | Cloudera
  • Ian Cook - Curriculum Developer | Cloudera

Technologies

  • SQL
  • Hive
  • Impala
