Giter VIP home page Giter VIP logo

eugene's Introduction

Hi, there. Here's some info about me.

Contact me
kristiyan stoimenov mailto:kristoimenov@gmail.com

Languages and Tools

Most of the time I am using these tehnologies:
c cplusplus python bash linux Rust

eugene's People

Contributors

boki1 avatar stoyantinchev avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

eugene's Issues

B-Tree bulk insertion

Description

Support an efficient mechanism for inserting multiple entries. This has multiple versions. The first one is the ability to create a tree from a "stream" of entries, whereas the other form is, adding a set of entries into an already instantiated tree index.

Design

FIXME

Reaserch

  • "Concurrency Control and I/O Optimality in Bulk Insertion" by Kerttu Pollari-Malmi and Eljas Soisalon-Soininen, 2015
  • "Modern B-tree techniques", ch. 6 by Goetz Graefe, 2010

Definition of Done

  • Feature implementation
  • Unit tests on it
  • Benchmarking
  • Validating improved performance over single-step insertions

Decompression algorithm implementation

Description

Create a decompression algorithm for decompressing previously compressed files with Huffman coding. Steps will be:

  • Initialize proper trie from the compressed file. Read how many folders/files the program is going to create inside the main folder.
  • Read n successive bits from the compressed file and store it in a leaf of the translation trie, after creating that leaf and sometimes after creating nodes that are binding that leaf to the trie.
  • And finally, create a function that translates a compressed file from info that is now stored in the translation trie then writes it to a newly created file.is not a number.

Core B-tree Functionality

Description

One of the core subsystems on which the whole system depends deals with indexing data. The information stored in Eugene is part of the index. This data structure is represented by a B-tree [1], [3] (multiple actually).

The generally accepted B-tree definition is as follows:
Def. B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.

Also, other requirements which the tree has to meet in order to qualify as B-tree are that:

  • Every node has at most m children.
  • Every non-leaf node (except root) has at least ⌈m/2⌉ child nodes.
  • The root has at least two children if it is not a leaf node.
  • A non-leaf node with k children contains k − 1 keys.
  • All leaves appear at the same level and carry no information.

Then, we can call this tree a B-tree of order m (also encountered as a branching factor).

The major advantages of using b-trees over other data structures typically are:

  • Keeping keys in sorted order for sequential traversing
  • Using a hierarchical index to minimize the number of disk reads
  • Using partially full blocks to speed up insertions and deletions
  • Keeping the index balanced with a recursive algorithm

Alternatives

B-trees dramatically outperform other types of data structures when it comes to reading and writing relatively large blocks of data, i.e from consistent storage devices. Thus, it is a common choice for databases and filesystems.

A variation of b-trees, called b*-trees may have an additional improvement to the performance of the tree. [2]

Additional context

B-tree
B*-tree

[1] Bayer, R.; McCreight, E. (July 1970). "Organization and maintenance of large ordered indices"
[2] Comer, Douglas (June 1979), "The Ubiquitous B-Tree"
[3] Bayer, Rudolf (1971), Binary B-Trees for Virtual Memory

Compression algorithm implementation

Description

Realize algorithm for compression. Compression algorithms are normally used to reduce the size of a file without removing information. This can increase their entropy and make the files appear more random because all of the possible bytes become more common. The compression algorithms can also be useful when they're used to produce mimicry by running the compression functions in reverse.

Possible solution

For now, there are 2 compression algorithms that we want to realize with their basic functionalities for compression and decompression. Namely sparse column and sparse row.

Alternatives

For alternatives, I'm not sure but there are some good Huffman codding and encoding algorithms. There is also encoding that uses binary trees in trie format.

Additional context

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.