giga

Concurrent File I/O on Arbitrarily-Sized Files

Overview

giga is a concurrent file I/O library which seeks to ameliorate large-file memory problems.

Many languages access files by essentially mmap-ing the contents of the file into RAM -- but doing so with large files presents memory allocation problems that have no easy solution. Moreover, as of this project, few public file libraries implement file insert and delete abstractions -- if a file is treated as a string, repeatedly calling file operations that alter the file length is very inefficient because of the implicit memcpy invocations. Finally, few file libraries handle concurrent file operations, an increasingly desired feature in today's multiprocessor environments.

Thus, we need a file library which can:

  • read a file in its entirety without allocating as much memory as the size of the file,
  • handle file length changes gracefully and efficiently, and
  • handle concurrent file changes.
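
The cost of length-changing edits can be illustrated with a small in-memory model. The sketch below is not giga's implementation; the class name ChunkedBuffer, its methods, and the chunk-size parameter are hypothetical, chosen only to show one common way to avoid shifting the whole file on every insert or erase: keep the data in a list of bounded chunks so that an edit only moves bytes inside the chunks it touches.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <utility>

// Hypothetical illustration only -- not giga's actual data structure.
// Data is kept as a list of chunks, each at most max_chunk bytes long.
class ChunkedBuffer {
public:
    explicit ChunkedBuffer(std::size_t max_chunk)
        : max_chunk_(max_chunk == 0 ? 1 : max_chunk) {}

    // Insert data at byte offset pos (clamped to the current length).
    void insert(std::size_t pos, const std::string& data) {
        if (chunks_.empty()) chunks_.emplace_back();
        auto it = chunks_.begin();
        // Walk to the chunk that contains pos.
        while (pos > it->size() && std::next(it) != chunks_.end()) {
            pos -= it->size();
            ++it;
        }
        if (pos > it->size()) pos = it->size();
        it->insert(pos, data);  // only this chunk's bytes are shifted
        // Split the chunk while it exceeds the size limit.
        while (it->size() > max_chunk_) {
            std::string tail = it->substr(max_chunk_);
            it->resize(max_chunk_);
            it = chunks_.insert(std::next(it), std::move(tail));
        }
    }

    // Erase up to len bytes starting at byte offset pos.
    void erase(std::size_t pos, std::size_t len) {
        auto it = chunks_.begin();
        while (it != chunks_.end() && len > 0) {
            if (pos >= it->size()) {  // range starts in a later chunk
                pos -= it->size();
                ++it;
                continue;
            }
            std::size_t n = std::min(len, it->size() - pos);
            it->erase(pos, n);
            len -= n;
            pos = 0;  // any remainder starts at the front of the next chunk
            if (it->empty()) it = chunks_.erase(it); else ++it;
        }
    }

    // Concatenate all chunks (for inspection only).
    std::string str() const {
        std::string out;
        for (const auto& c : chunks_) out += c;
        return out;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t max_chunk_;
    std::list<std::string> chunks_;
};
```

With a chunk size of 4, inserting "abcdefgh" yields two chunks, and a later insert in the middle of the second chunk leaves the first chunk untouched. A flat string would instead shift every byte after the insertion point on each call.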

Installation

git clone https://github.com/cripplet/giga.git
cd giga
git submodule update --init --recursive
# test the lib -- this will take a VERY long time
make CONCURRENT=false PERFORMANCE=true test

Updating

git pull
git submodule foreach --recursive git checkout master
git submodule foreach --recursive git pull

Usage

An example can be found in tutorial/ and can be executed by running:

cd tutorial/
make
./tutorial.app

All header and library files in the tutorial are symbolically linked to tutorial/external/giga -- removing the symbolic link and cloning giga (and installing dependencies, as per above) into the same place will not break the Makefile.

// create a new file ('+') in the case it does not exist, and open with read-write properties
std::shared_ptr<giga::File> f (new giga::File("new_file.txt", "rw+"));

// open a new client with write-only privileges
std::shared_ptr<giga::Client> c_1 = f->open(NULL, "w");

// open another client with the privileges of the file ("rw" in this case)
std::shared_ptr<giga::Client> c_2 = f->open();

// atomically inserts at the beginning of the file
c_2->write("prepend\n", true);

// atomically seeks to the beginning of the file, in a relative seek
c_2->seek(7, false);

// seek to 1 byte from the end of the file, in an absolute seek
c_1->seek(1, false, true);

// overwrite data here
c_1->write(" ");

// seek to the beginning of the file, in an absolute seek
c_2->seek(0, true, true);

// atomically erase "prepend"
c_2->erase(7);

// append to the end of the file
c_1->write("append");

// read from the file for at most 100 bytes
// " append"
std::cout << c_2->read(100) << std::endl;

// close files
c_1->close();
c_2->close();

// save the file
f->save();
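
To see why the final read yields " append", the byte-level effect of the calls above can be mimicked on a plain std::string. This is only a model, not giga itself: the function name trace_tutorial_edits is hypothetical, and it assumes that new_file.txt starts empty and that write() without the second argument overwrites in place (or appends when positioned at the end of the file).

```cpp
#include <string>

// Model of the tutorial's edits on a std::string (assumption: the file starts empty).
std::string trace_tutorial_edits() {
    std::string file;                       // new, empty file
    file.insert(0, "prepend\n");            // c_2 inserts at the beginning: "prepend\n"
    file.replace(file.size() - 1, 1, " ");  // c_1 overwrites the byte 1 from the end: "prepend "
    file.erase(0, 7);                       // c_2 erases "prepend": " "
    file.append("append");                  // c_1 appends at the end: " append"
    return file;
}
```

The returned string is " append", matching the output noted in the example's comments.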

More examples can be found in the tests directory, including usage of the performance test suite (tests/performance.cc).

Caveats

  • giga reserves the path /tmp/giga/ to store all intermediate files
  • giga will probably need some tailoring to compile on Windows machines

Todo

  • tutorial
  • implement a better caching backend (will NOT break giga interface)
  • more code documentation
  • compare performance to the Worthmuller report

Contact

  • github
  • gmail
  • issues and feature requests should be directed to the project issues page
