crunch's Issues

Add daemon plugin

Add support for a daemon plugin which can manipulate a table in order to perform maintenance operations. Initially this will be limited to file consolidation.

Two strategies need to be examined. The first is to have the daemon use the handler interface to interact with the table, similar to the handler_socket plugin. The second is to link directly against the ha_crunch library and manipulate the table outside of the MariaDB handler structure. The main issue with the second approach is that locks still need to be acquired in order to safely consolidate the files.

Support unsigned integers properly

Right now unsigned integers are not supported. This is simply because we only check the field type in getCapnpTypeFromField and build_row. We need to check whether the unsigned bit is set and create the schema to store the value appropriately, as sketched below.
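A minimal sketch of the intended check, assuming the MariaDB plugin headers are available; the helper name and the fallback type are illustrative, not crunch's actual getCapnpTypeFromField:

// Hedged sketch only: map an integer Field to a cap'n proto type name while
// honouring the unsigned bit. Assumes the MariaDB plugin headers (field.h).
static const char *capnpIntTypeForField(Field *field) {
  bool is_unsigned = field->flags & UNSIGNED_FLAG;  // the check that is missing today
  switch (field->type()) {
    case MYSQL_TYPE_TINY:     return is_unsigned ? "UInt8"  : "Int8";
    case MYSQL_TYPE_SHORT:    return is_unsigned ? "UInt16" : "Int16";
    case MYSQL_TYPE_INT24:    // mediumint: no 24-bit cap'n proto type, widen to 32
    case MYSQL_TYPE_LONG:     return is_unsigned ? "UInt32" : "Int32";
    case MYSQL_TYPE_LONGLONG: return is_unsigned ? "UInt64" : "Int64";
    default:                  return is_unsigned ? "UInt64" : "Int64";
  }
}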

Add support for online table alters

The idea is to add support for online table alters by creating a new version of the schema. We can then migrate data on read. This puts the work on reads instead of causing a table rebuild when a column is changed. During file consolidation, older data files can be upgraded to the latest version of the schema.

If only the column order is being changed, nothing has to be done. Even if the on-disk storage and the cap'n proto schema are in a different order, the fields are set through the field interface, so no changes are needed.

If a column name changes, we can just create a new cap'n proto schema with the new names; as long as the order and data types stay the same, the data on disk doesn't need to change.

Changes are needed for:

  • ADD_COLUMN
  • ALTER_COLUMN_DEFAULT
  • ALTER_COLUMN_NULLABLE
  • ALTER_COLUMN_FORMAT
  • ALTER_COLUMN_STORAGE_TYPE

Implement the following handler methods (a minimal sketch of the first one follows the list):

  • check_if_supported_inplace_alter(TABLE *altered_table, Alter_inplace_info *ha_alter_info)
  • prepare_inplace_alter_table(TABLE *altered_table, Alter_inplace_info *ha_alter_info)
  • inplace_alter_table(TABLE *altered_table, Alter_inplace_info *ha_alter_info)
  • commit_inplace_alter_table(TABLE *altered_table, Alter_inplace_info *ha_alter_info, bool commit)
  • notify_table_changed()
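A rough sketch of what the first method could look like; the flag names follow the Alter_inplace_info constants listed above, and both the supported set and the no-lock result are placeholders rather than a final decision:

// Hedged sketch: advertise which alters crunch could do in place.
enum_alter_inplace_result
ha_crunch::check_if_supported_inplace_alter(TABLE *altered_table,
                                            Alter_inplace_info *ha_alter_info) {
  // Operations we expect to handle by writing a new schema version and
  // migrating rows on read; anything else falls back to a copying alter.
  const auto supported = Alter_inplace_info::ADD_COLUMN |
                         Alter_inplace_info::ALTER_COLUMN_DEFAULT;
  if (ha_alter_info->handler_flags & ~supported)
    return HA_ALTER_INPLACE_NOT_SUPPORTED;
  return HA_ALTER_INPLACE_NO_LOCK;  // readers migrate old rows on the fly
}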

Add support for Updates

Add support for Updates.

Updates could be done in place by creating the new message and just memcpy'ing it over the old record.

New design:

An update is a delete plus an insert; a sketch follows. See #18

This needs #13
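A minimal sketch of that design, reusing crunch's existing delete_row() (#18) and write_row() (#1); the exact handler signatures vary slightly between MariaDB versions:

// Hedged sketch: treat an UPDATE as a delete of the old row plus an insert of
// the new one, reusing the existing handler methods.
int ha_crunch::update_row(const uchar *old_data, uchar *new_data) {
  int rc = delete_row(old_data);  // record the old row in the delete file (#18)
  if (rc != 0)
    return rc;
  return write_row(new_data);     // append the new version to the data file (#1)
}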

Add order by for tests to work around travis ci

Add ORDER BY to tests to work around Travis CI failures. Sometimes the resulting transaction files end up in a different order in the Travis CI VM. I am unable to reproduce this locally, in Docker, or in KVMs. For now, to prevent the false test failures, we will add ORDER BY to tests with multiple result rows.

Convert column names to capnp format

Cap'n proto does not support underscores in field names and enforces camelCase. We need to convert column names, which often contain underscores, to camelCase; a possible helper is sketched below.
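A possible standalone helper for this conversion; the function name is illustrative and not necessarily what crunch ends up using:

#include <cctype>
#include <string>

// Convert a MySQL column name (often snake_case) to a cap'n proto style
// camelCase field name.
static std::string camelCaseFieldName(const std::string &column) {
  std::string out;
  bool upper_next = false;
  for (char c : column) {
    if (c == '_') {
      upper_next = true;               // drop the underscore, capitalise what follows
    } else if (upper_next) {
      out += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
      upper_next = false;
    } else {
      out += c;
    }
  }
  if (!out.empty())                    // cap'n proto field names start lower-case
    out[0] = static_cast<char>(std::tolower(static_cast<unsigned char>(out[0])));
  return out;
}

// e.g. camelCaseFieldName("created_at_utc") == "createdAtUtc"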

Investigate flaky mediumint to bigint test

The alter table test with multiple inplace alters that changes a mediumint to a bigint is failing in Travis CI. This failure cannot be reproduced anywhere other than Travis CI. The test was disabled but should be investigated.

25b4745

Sample failure: https://travis-ci.org/Shelnutt2/crunch/builds/343135301

exception on rnd_next ./test/t1: capnp/layout.c++, line: ../storage/crunch/src/crunch.cpp:346, exception_line: 2159, type: 0, e.what(): expected boundsCheck(segment, ptr, ref->structRef.wordSize()); Message contained out-of-bounds struct pointer.
2018-02-18 22:32:29 139966787450624 [ERROR] mysqld: Got error -44 "Internal error < 0 (Not system error)" from storage engine Crunch

Switch CI tests to use docker image for build dependencies

Switch CI tests to use a docker image for build dependencies. This will avoid having to compile cap'n proto for every test run, which adds 5-10 minutes to the testing. We can also skip installing the Ubuntu dependencies and expand testing to other gcc/clang versions with little overhead.

Add transaction support

Add transaction support.

This involves storing transaction data for deletes and writes in new files. On commit of a transaction, the files are closed and moved from the transaction's working directory to the main table folder; a sketch of the commit step follows.

The transaction folder also needs to be cleared out on startup.
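A hedged sketch of the commit step; the directory layout and file extensions (.capnpd, .deleted) are assumptions for illustration only:

#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Move a transaction's files from its working directory into the table folder.
static bool commitTransactionFiles(const fs::path &txn_dir,
                                   const fs::path &table_dir,
                                   const std::string &txn_id) {
  std::error_code ec;
  for (const char *ext : {".capnpd", ".deleted"}) {
    fs::path src = txn_dir / (txn_id + ext);
    if (!fs::exists(src))
      continue;                                        // e.g. no deletes in this transaction
    fs::rename(src, table_dir / src.filename(), ec);   // atomic on the same filesystem
    if (ec)
      return false;
  }
  return true;
}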

Move data files into data folder

We should move data files into a data folder. This would make it easier to atomically consolidate files. Right now we rename/move all existing files into a "consolidate folder", then we rename the new data file from transactions into the main folder, then we delete the consolidate folder. That is too many operations, and not atomic enough in case of failure.

What we want to do is create data_dir_X directories and keep data as a symlink pointing at the active one. Switching the symlink can be done atomically (a sketch follows the link):
http://blog.moertel.com/posts/2005-08-22-how-to-change-symlinks-atomically.html
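A sketch of the swap described in the linked post, applied to the layout above; the path names are illustrative. Create the new symlink under a temporary name, then rename() it over "data" - rename(2) replaces the old link atomically on the same POSIX filesystem:

#include <cstdio>    // std::rename
#include <string>
#include <unistd.h>  // symlink, unlink

static bool swapDataSymlink(const std::string &table_dir,
                            const std::string &new_data_dir /* e.g. "data_dir_2" */) {
  const std::string tmp_link  = table_dir + "/data.tmp";
  const std::string data_link = table_dir + "/data";
  unlink(tmp_link.c_str());  // remove any leftover temporary link
  if (symlink(new_data_dir.c_str(), tmp_link.c_str()) != 0)
    return false;
  return std::rename(tmp_link.c_str(), data_link.c_str()) == 0;  // atomic swap
}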

Add advanced data/delete file support

Initial delete support was added in #18. Initial writes in #1.

This issue is to extend the data/delete support so that multiple files can contain the data and deletes. The basic idea is that any time an update or delete statement is issued, we stop writing new rows to the current data file, start a new data file, and move the deletes to the new file.

Handle crash recovery

By nature of using write-once files per transaction, we have at most two atomic operations per table during a transaction commit: one rename of the data file and one rename of the delete file.

From a single-table perspective (the simplest transaction), if a crash occurs after only one of the atomic operations has completed, on startup we must roll back and delete this partial transaction. The affected transaction can be identified by comparing the data folder to the transactions folder: since all of a transaction's files share the same name and differ only in extension, the partially committed transaction can be found.

This must be done only during crash recovery, since in normal operation a half-committed state is expected whenever a transaction is mid-flight.

Transactions can span multiple tables, and in that case the same basic procedure applies. All tables get the same transaction id (epoch nanoseconds + uuid), so checking across all tables for partial commits is doable. This, however, only works if the transaction involved two or more crunch tables; if the transaction is cross-engine, we cannot do this effectively without XA commits. A sketch of the single-table check follows.
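A rough sketch of the single-table startup check, using std::filesystem; the directory layout is an assumption and the rollback itself (deleting the partial files) is omitted:

#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// A transaction id that still has files in the transactions folder but already
// has a file with the same stem in the data folder was only partially committed.
static std::set<std::string> findPartialTransactions(const fs::path &data_dir,
                                                     const fs::path &txn_dir) {
  std::set<std::string> committed, pending, partial;
  for (const auto &entry : fs::directory_iterator(data_dir))
    committed.insert(entry.path().stem().string());  // same name, different extensions
  for (const auto &entry : fs::directory_iterator(txn_dir))
    pending.insert(entry.path().stem().string());
  for (const auto &id : pending)
    if (committed.count(id))
      partial.insert(id);  // one half reached the data folder, the other did not
  return partial;
}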

Setting Default Value for Capnp Schema

Marko was able to show me how InnoDB sets the default value in its instant ADD COLUMN. They basically call ->set_default() on the field and then read the field value back. We should be able to do the same; a sketch follows.
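A minimal sketch of that approach against the MariaDB Field API; how crunch would then feed the value into the cap'n proto schema is left out, and the helper assumes the field points at a valid record buffer:

// Hedged sketch: materialise the default with Field::set_default(), then read
// it back through the normal accessors.
static void readColumnDefault(Field *field, String *out) {
  field->set_default();    // writes the column default into the record buffer
  if (field->is_null())
    out->length(0);        // NULL default: leave the output empty
  else
    field->val_str(out);   // read the default back as text
}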

Add testing framework

Need to add support for a testing framework. We need the ability to add mysql tests for this storage engine.

MyISAM handles this with test executables; however, I'd like to use the MySQL testing framework so tests can be run inside the database to ensure all functions are implemented correctly.

Speed up CI builds

CI builds can be sped up if we limit what we build along with MariaDB:

  • disable tokudb
  • disable rocksdb
  • disable mroonga
  • disable spider
  • disable sphinx
  • disable federated
  • disable federatedx
  • disable connect
  • disable oqgraph

Support multiple charsets

Cap'n proto expects all text to be UTF-8. Right now, when we get blob or char/varchar data, we assume it is UTF-8. Instead of assuming, a conversion should be done.

Consolidation Locking Enhancements

Right now table locks are all that is supported. When #13 is implemented, we will have finer-grained control. This issue will happen after #13.

The basic idea is that we can use shared read-only locks for everything but deletes. For deletes, if we want to allow read-only locks, we'd need to keep a mapping of (old file, old position) to (new file, new position) so we can port deletes that happen while consolidation is running; a sketch follows. The commit phase always requires an exclusive lock.
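A sketch of the bookkeeping this would need; the types and member names are illustrative only:

#include <cstdint>
#include <map>
#include <string>
#include <utility>

using RowLocation = std::pair<std::string, uint64_t>;  // (file name, offset)

struct ConsolidationRemap {
  std::map<RowLocation, RowLocation> old_to_new;

  // Record each row's new location as it is copied into the consolidated file.
  void record(const RowLocation &old_loc, const RowLocation &new_loc) {
    old_to_new[old_loc] = new_loc;
  }

  // At commit time (exclusive lock), port a delete that arrived mid-consolidation.
  bool port(const RowLocation &old_loc, RowLocation *new_loc) const {
    auto it = old_to_new.find(old_loc);
    if (it == old_to_new.end())
      return false;  // the row was never copied (e.g. already deleted)
    *new_loc = it->second;
    return true;
  }
};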

Timestamp to Datetime conversions incorrect

Timestamp-to-datetime conversions that happen during message upgrades are not correct. Timestamps are stored in UTC, and MySQL converts them to the server timezone on SELECT. During a normal offline ALTER TABLE, timestamps are converted to the server timezone and then written to the datetime field. We should do the same.

Add support for info()

For info() we can compute the data file size by adding a new member to the data_struct that holds the file size. We can then also maintain a class variable for the total size, updated on each run of findTableFiles; a sketch follows.
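A sketch of the bookkeeping; the struct and member names stand in for crunch's actual per-file data_struct and are not the real identifiers:

#include <cstdint>
#include <string>
#include <vector>

struct DataFileInfo {
  std::string path;
  uint64_t size_bytes;  // new member populated when the file is discovered
};

struct CrunchFileStats {
  std::vector<DataFileInfo> data_files;
  uint64_t total_data_size = 0;  // refreshed on each run of findTableFiles

  void addDataFile(std::string path, uint64_t size_bytes) {
    total_data_size += size_bytes;
    data_files.push_back({std::move(path), size_bytes});
  }
};

info() could then report the running total as the table's data size.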

Add support for deletes

Delete support is needed. Deletes can be done in multiple ways.

  1. Add a delete indicator to the record; this would essentially be an update with a hidden field.
  2. Zero out the entire message in the file; this means we have to handle gaps between messages when reading/scanning the file.
  3. Maintain a separate file with the list of deleted rows.

"1)" or "2) "are pretty equivalent. The advantage to 2 is during a table scan one does not have to parse the message only to find it has been deleted. It might also be that option 1 makes roll back easier. However with 1 or 2 we still need to maintain a list of the ongoing rows touched in the transactions.

"3)" Does not seem to have a large benefit. If we keep the rows separate, then we just have to read that into memory and still do a comparison. The only upside compared to 1, is we don't have to parse capnp proto message to see if it is deleted or not, we can store the file offset and skip that way.

With option 2 we can also have a daemon process that periodically reorganizes a closed table, removing the zeroed-out space between messages and truncating the file size. A sketch of the option-2 read-path skip follows.
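An illustrative sketch only of how the option-2 read path could skip zeroed regions; the word-level framing here is deliberately simplified and does not reflect crunch's real cap'n proto segment layout:

#include <cstdint>

static const uint64_t *skipDeletedWords(const uint64_t *pos, const uint64_t *end) {
  while (pos < end && *pos == 0)  // deleted messages were overwritten with zeros
    ++pos;
  return pos;                     // start of the next live message, or end of file
}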

Advanced inplace alter table support

In #50, online (inplace) alter table support was added for column additions, name changes, and dropping columns.

Changing a column's datatype to a non-compatible datatype (e.g. int to string) currently requires a rebuild. This should be handled with online alters. The basic requirement is a conversion process from one data type to another (i.e. itoa or atoi); on reads the value is then converted on the fly, and when a consolidation happens the data on disk should be updated so the conversion is no longer required. A minimal sketch of such a conversion follows.
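A minimal sketch of one conversion pair (an int column altered to a string column, and the reverse); the per-schema-version dispatch is omitted and the function names are illustrative:

#include <cstdint>
#include <string>

static std::string convertIntToString(int64_t old_value) {
  return std::to_string(old_value);  // the "itoa" direction
}

static int64_t convertStringToInt(const std::string &old_value) {
  return std::stoll(old_value);      // the "atoi" direction; invalid input throws
}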

Update documentation - Create docs folder

Update documentation - Create docs folder.

  • Move architecture documentation to docs/Architecture.md
  • Add feature checklist to readme
  • Add limitations to readme
  • Add non-production disclaimer to readme

Consider storing more field details in schema

Consider changing the on-disk cap'n proto schema so that each column is stored as a field struct. There is a lot of metadata we are not storing right now, such as a column's nullability or default values. Default values could be stored if they are constants, but if they are expressions we must rely on MariaDB storing them in the .frm file.

In order to do auto table discovery, we have to store everything needed in the table data (data files or schema files) ourselves.

The limitation of inplace alter table when adding a nullable column with a default (where existing rows always read as null) exists because we are not storing this information.

The downside: right now each row is stored simply. Does it really make sense to pack all this extra information into every single row? It would greatly increase disk space and the processing of data that is constant (for a given schema version).

Perhaps we introduce a new data file that goes along with the schema and contains the table's metadata? We keep the "rows" struct compact and simple, and create a new struct to represent the table and all of its metadata. The advantage is that the data files stay compact and we only have to write the "table metadata" once per schema change. The downside is that any single row data file would be missing the data required for logic, such as whether a column with no value uses a default expression. New struct/metadata files also increase the complexity. Right now storage is simple, but the downside is that virtual columns are not supported.

I'm leaning toward the new struct and new data files. It does not make sense to expand the row structure so that every column is a field struct containing all the metadata; it'd be a massive amount of duplication, and we have no compression (yet)!
