Giter VIP home page Giter VIP logo

cs3223-project's Introduction

CS3223-Project

Setting up the environment

A makefile was created to quickly set up the environment for testing. The following commands are as such:

make build runs build.sh to compile the necessary .java files.

make db will generate and convert .det files into tables.

make experiment is similar to make db but is used for the purposes of the experiments.

make clean will remove all files created by make db and make experiment

Features Implemented

List of JOINs Implemented

  • Block Nested Loops Join
  • Sort Merge Join

List of Operators Implemented

  • Distinct
  • Orderby
  • Groupby

Implementation of Block Nested Loops Join

The Block Nested Loops Join (BNJ) implementation is modified from the given Nested Loops Join implementation.
The block size for BNJ is calculated using number of buffers - 2.
In the open() method, a new cursor is initailised for the outer block and a new boolean is initialised for the end of block object.
In the next() method, we simulate a block using Java's ArrayList. For each block, we will iterate through each left page and right page, matching the left tuples to the right tuples.
Once the outbatch is full, we will keep track of the cursor of the block, left and right for the next iteration.

Implmentation of Sort Merge Join

The SortMergeJoin operator utilises external sorting on the left and right child operators. For each child, a run is sorted and generated on the join attributes. By exploiting the sorted order, both runs are iterated through in ascending order to find and join suitable tuples.

Implementation of Distinct

The Distinct operator is implemented using a sort-based approach, which also makes use of external sort. A sorted run is generated from its child operator, and it is read in again to remove duplicate tuples.

Implementation of Orderby

The Orderby operator uses the external sorting algorithm to order the attributes.

It sorts according to the list of attributes to order by and whether the tuples should be arranged in ascending or descending order.

Implementation of Groupby

The Groupby operator leverages the external sorting algorithm used for Orderby and Sort Merge Join to partition the desired groups. The conditions required to be fulfilled are:

  • Attribute in SELECT clause must appear in the Groupby Clause OR
  • Attribute is a primary key.

If the above conditions are not fulfilled, RandomInitialPlan will not allow the query to pass.

Bug(s) Fixed

Bug 1: Plancost of Nested Join is incorrect

  • The joinCost was calculated as leftpages * rightpages when it should be leftpages + leftpages * rightpages

Bug 2: Nested Join did not call close() on left and right child operators

  • NestedJoin::close() should call left.close() and right.close() so that all the operators in the left and right subtrees are recursively called, and cleanup of unused temporary files is done.

cs3223-project's People

Contributors

andlimey avatar lingzhiyu avatar yangjiyu98 avatar

Watchers

 avatar  avatar

Forkers

yangjiyu98

cs3223-project's Issues

Implement Join

Implement ONE of the following:

  1. Sort Merge Join (Requires implementation of sort-merge algorithm)
  2. Indexed Nested Loops Join (Requires an index to be built)
  3. Hash Join

Add new DataTypes in Attribute.java

Currently only Float, Int and String are supported.

Other data types required:

  • Time

Files to Modify:

  • Attribute.java
  • RandomDB.java
  • ConvertTxtToTbl.java

Implement Groupby or Aggregate Functions

For Group by, for each column A in relation R that appears in the SELECT clause, one of the following conditions must hold:

  • A appears in the GROUP BY clause
  • The primary key or R appears in the GROUP BY clause

List of Aggregate Functions:

  • Min
  • Max
  • Count
  • Avg

Implement Orderby

This should support both ASC and DESC. Parser needs to be modified to support this.

Add README

Include a README file that should cover sufficient information for the TA to figure out the implementation and understand the modifications easily.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.