
caustic's People

Contributors

ashwin153

caustic's Issues

RPC Interface

In order to do #6, a rich cross-language interface needs to be put in front of the transaction execution engine.

Cons Syntax is Verbose

Chaining multiple commands together in a cons is extremely verbose.

def block(a: Transaction, b: Transaction, rest: Transaction*): Transaction =
  rest.foldLeft(cons(a, b))((x, y) => cons(x, y))

This lets you write block(a, b, c, d) instead of cons(cons(cons(a, b), c), d).
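To make the nesting concrete, here is a minimal self-contained sketch. The Transaction, Literal, and Cons definitions are hypothetical stand-ins for the real types, which this issue does not show; only block mirrors the code above. Note that foldLeft nests to the left.

```scala
object BlockSketch {
  // Hypothetical, minimal stand-ins for the real transaction types.
  sealed trait Transaction
  case class Literal(value: String) extends Transaction
  case class Cons(first: Transaction, second: Transaction) extends Transaction

  def cons(a: Transaction, b: Transaction): Transaction = Cons(a, b)

  // Folds any number of transactions into a left-nested chain of cons.
  def block(a: Transaction, b: Transaction, rest: Transaction*): Transaction =
    rest.foldLeft(cons(a, b))((x, y) => cons(x, y))
}
```

With four literals a, b, c, d, BlockSketch.block(a, b, c, d) builds Cons(Cons(Cons(a, b), c), d).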

Collections Library

Add a Map type to the Caustic language. This Map type would be implemented as a B+ tree, which would allow efficient bulk lookups of record attributes. For example, you could find all records where a certain field is less than or greater than a particular value.

Other collection types like List and Set can be implemented using a Map. For example, a List is a Map with integer keys and a Set is a Map from keys to themselves.
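A rough sketch of the idea in plain Scala, not in the Caustic language itself. It assumes scala.collection.immutable.TreeMap (a red-black tree) as a stand-in for the proposed B+ tree, since both keep keys sorted; the names byAge, agesBetween, append, and insert are made up for illustration.

```scala
import scala.collection.immutable.TreeMap

object CollectionsSketch {
  // Sorted map standing in for the proposed B+ tree backed Map.
  val byAge: TreeMap[Int, String] = TreeMap(25 -> "alice", 31 -> "bob", 47 -> "carol")

  // Range lookup: values whose key lies in [from, until).
  def agesBetween(from: Int, until: Int): List[String] =
    byAge.range(from, until).values.toList

  // A List is a Map with integer keys: appending uses the next free index.
  def append(list: TreeMap[Int, String], value: String): TreeMap[Int, String] =
    list.updated(list.size, value)

  // A Set is a Map from keys to themselves.
  def insert(set: TreeMap[String, String], key: String): TreeMap[String, String] =
    set.updated(key, key)
}
```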

Macros

Insert runtime transactions directly into Caustic programs.

Syntax Redesign

Object

  • $fields
  • $reference

Dynamic Typing

  • x.foo.bar
  • x.foo is a reference to x.foo.bar
  • x.foo is a field
  • x.foo("bar") is the foo$bar field of x
  • x.foo("bar")("car") is the foo$bar$car field of x
  • x.foo("bar", "car") = x.foo("bar")("car")
  • Foreach (ctx.i, x.foob) iterates over the fields of x that are prefixed by foob
  • Delete(x) removes all the fields of x and propagates to all the references of x
  • Stitch(x) returns a json object containing all fields of x.
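One way the flattened field names above could be modelled, as a sketch only: Record, Field, and their methods are hypothetical names, and the real syntax may resolve fields quite differently. The point is that each applied name extends a path that is joined with '$'.

```scala
object FieldPathSketch {
  // Hypothetical handle to a field of a record; path segments are joined with '$'.
  case class Field(record: String, path: List[String]) {
    // x.foo("bar") addresses foo$bar; several names may be supplied at once.
    def apply(names: String*): Field = copy(path = path ++ names)

    // The flattened name under which the field would be stored.
    def name: String = path.mkString("$")
  }

  case class Record(id: String) {
    def field(name: String): Field = Field(id, List(name))
  }
}
```

Under these assumptions, Record("x").field("foo")("bar")("car") and Record("x").field("foo")("bar", "car") both name the foo$bar$car field.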

Static Typing

  • x('foo) = x.foo
  • x('foo)('bar) = x.foo.bar

Import Resolution

Right now it is impossible to link together Caustic files. We'll need to add some dependency resolution logic into the compiler.

Caustic REPL

Create a website where you can write out transactions and run them on a database directly in the browser. Useful way to learn web development stuff, and a great tool for people to test out code. In practice, this will work just like the Caustic syntax unit tests.

Docker Integration

The runtime is executable, but it requires the system to be "just so" in order to function properly. Make the executable configuration file-based instead of programmatic, and get it to run within a docker container. This will make it really easy to spin up server instances.

Add Code Coverage

Lots of great tools out there, but might be challenging to integrate them with pants. The pants build repository is using coveralls and the foursquare open source project is using codecov.

Add None Type

The caustic-syntax package is desperately in need of a None type. There is currently no way to distinguish between no value and the empty string.
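A small sketch of what the distinction could look like. The Value, NoneValue, and Text names and the Map-backed store are assumptions made for illustration, not the caustic-syntax API.

```scala
object NoneSketch {
  sealed trait Value
  case object NoneValue extends Value            // hypothetical marker for "no value"
  case class Text(value: String) extends Value   // any string, including the empty one

  // Reads a key from a hypothetical store; a missing key is not the same as "".
  def read(store: Map[String, String], key: String): Value =
    store.get(key) match {
      case Some(text) => Text(text)
      case None => NoneValue
    }
}
```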

Remove Dependency on Akka

Users shouldn't have to depend on all of Akka just to schedule retries. Also, the syntax for creating an implicit ActorSystem is clunky. Much preferable to schedule TimerTasks on the underlying ExecutionContext.
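A sketch of the Akka-free approach, assuming Scala 2.12 or later. The names RetrySketch, after, and retry are hypothetical; the delay is scheduled on a java.util.Timer and the work itself runs on whatever ExecutionContext the caller supplies.

```scala
import java.util.{Timer, TimerTask}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration

object RetrySketch {
  // A single daemon timer thread takes the place of the ActorSystem scheduler.
  private val timer = new Timer("retry-timer", true)

  // Runs `f` on the execution context once `delay` has elapsed.
  def after[T](delay: FiniteDuration)(f: => Future[T])(implicit ec: ExecutionContext): Future[T] = {
    val promise = Promise[T]()
    timer.schedule(new TimerTask {
      override def run(): Unit = { promise.completeWith(Future(f).flatten); () }
    }, delay.toMillis)
    promise.future
  }

  // Retries `f` after each backoff in turn until it succeeds or backoffs run out.
  def retry[T](backoffs: List[FiniteDuration])(f: => Future[T])(implicit ec: ExecutionContext): Future[T] =
    f.recoverWith {
      case _ if backoffs.nonEmpty => after(backoffs.head)(retry(backoffs.tail)(f))
    }
}
```

The timer thread only completes a promise, so user code never runs on it.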

Add Scrooge Support

In order to generate Thrift Scala code for #24, we must first add support for Scrooge.

./pants.ini

Scrooge requires a special pants plugin.

plugins: [
    'pantsbuild.pants.contrib.scrooge==%(pants_version)s',
]

./BUILD.tools

Scrooge requires a special generator and linter library.

jar_library(name = 'scrooge-gen',
    jars = [
        jar(org='com.twitter', name='scrooge-generator_2.12', rev='4.18.0', excludes=[
            exclude(org='org.apache.thrift', name='libthrift')
         ])
    ],
    dependencies=[
        '3rdparty:thrift-0.6.1',
    ]
)

jar_library(name = 'scrooge-linter',
    jars = [
        jar(org='com.twitter', name='scrooge-linter_2.12', rev='4.18.0', excludes=[
            exclude(org='org.apache.thrift', name='libthrift')
         ])
    ],
    dependencies=[
        '3rdparty:thrift-0.6.1',
    ]
)

./3rdparty/jvm/BUILD

Scrooge requires version 0.6.1 of Thrift.

jar_library(name='thrift-0.6.1', jars = [
    jar(org='org.apache.thrift', name='libthrift', rev='0.6.1')
])

./caustic-runtime/src/main/thrift/BUILD

java_thrift_library(
  name='scala',
  compiler='scrooge',
  language='scala',
  sources=rglobs('*.thrift'),
)

Caustic to SQL Converter

Write a program that converts Schema transactions into runnable ANSI SQL. Obviously, certain Schema transactions are extremely difficult or impossible to express in ANSI SQL. This is a useful way to compare Schema and SQL both for marketing and for debugging.

class Transaction {
  // Renders this transaction as SQL over a single key-value table.
  def toSQL: String = this match {
    case Read(k) =>
      s"""SELECT value FROM table WHERE key = "${k.toSQL}""""
    case Write(k, v) =>
      // ON DUPLICATE KEY UPDATE is MySQL syntax, not ANSI SQL.
      s"""INSERT INTO table (key, value) VALUES ("${k.toSQL}", "${v.toSQL}") ON DUPLICATE KEY UPDATE value = "${v.toSQL}""""
    case Literal(x) =>
      x.toString
    case Cons(x, y) =>
      s"""${x.toSQL}; ${y.toSQL}"""
    case Add(x, y) =>
      s"""${x.toSQL} + ${y.toSQL}"""
    ...
    case default =>
      throw new UnsupportedOperationException(s"$default is inexpressible in SQL")
  }
}

Some operations like repeat, prefetch, load, store, and branch are inexpressible in SQL. For these operations, an exception will be thrown to indicate that the transaction cannot be parsed.

Benchmark and Stress Test

Before Schema can be safely used in production, we need to test its performance under various workloads (high read, high write, high contention, etc.). Performance is generally the main concern people have with this project. Benchmarks should be a top priority to convince people about the viability of the project.

  • Transaction throughput
  • Transaction latency
  • Allocations
  • Database size

Build File Style

The ./pants.ini file is getting way too complex. I think it'll be much easier to find issues if the different sections were alphabetized. Furthermore, we should adhere to Pants conventions like naming it publish.ivysettings.xml instead of ivy-publish.xml.

Service Discovery

Make it easier to bootstrap a Database server and simplify the process of connecting clients and executing transactions. Implement service discovery to allow servers to automatically register themselves and clients to automatically discover them. Service discovery can be easily implemented on top of ZooKeeper using standard Curator recipes.

Project Rename

Schema is really difficult to search for and is too prevalent in the database world.

Deletion Should Propagate to Indices

Suppose an object is referenced from an index. When the object is deleted, it should be removed from the index. This will make it so that you can store collections of objects and iterate over them. We'll associate with each object a list of indices that it is a member of.
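A minimal in-memory sketch of the bookkeeping described above. IndexedStore and its members are hypothetical names; the real implementation would store these membership lists in the database itself.

```scala
import scala.collection.mutable

object IndexSketch {
  class IndexedStore {
    val objects = mutable.Map.empty[String, String]
    // Index name -> ids of the objects it contains.
    val indices = mutable.Map.empty[String, mutable.Set[String]]
    // Object id -> names of the indices it is a member of.
    val memberships = mutable.Map.empty[String, mutable.Set[String]]

    def addToIndex(index: String, id: String): Unit = {
      indices.getOrElseUpdate(index, mutable.Set.empty[String]) += id
      memberships.getOrElseUpdate(id, mutable.Set.empty[String]) += index
    }

    // Deleting an object also removes it from every index that references it.
    def delete(id: String): Unit = {
      objects -= id
      memberships.remove(id).foreach(_.foreach(index => indices(index) -= id))
    }
  }
}
```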

Caustic Programming Language

A full-fledged programming language that compiles into Thrift/HTTP/Protobuf interfaces and implementations, backed by configurable storage engines, which are served when the program is run.

record NameOfRecord {
 bool x,
 double y,
 string z,
 NameOfOtherRecord bar,
 NameOfRecord& car,
}
service NameOfService { 
  def getAllUsers(foo: NameOfRecord): NameOfRecord  {
    if (foo.x) {
      var i = 0
      while (i < foo.y) {
        foo.bar.z = "3"
        foo.car.z = "3"
        i++
      }
    }

    return foo;
  }

}

Integrated Caching

Right now the TransactionalDatabase creates a new snapshot each time it executes a transaction. Instead, this snapshot should be an LRU cache that is invalidated on unsuccessful writes.
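A sketch of such a cache, built on java.util.LinkedHashMap in access order. LruCache and invalidate are hypothetical names, and how TransactionalDatabase would actually wire this in is not covered here.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

object CacheSketch {
  class LruCache[K, V](capacity: Int) {
    // accessOrder = true moves an entry to the back every time it is read.
    private val entries = new JLinkedHashMap[K, V](16, 0.75f, true) {
      override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean = size() > capacity
    }

    def get(key: K): Option[V] = Option(entries.get(key))

    def put(key: K, value: V): Unit = { entries.put(key, value); () }

    // Called after an unsuccessful write: the cached values may be stale.
    def invalidate(keys: Iterable[K]): Unit = keys.foreach(key => entries.remove(key))
  }
}
```

After a failed write, the keys involved would be passed to invalidate so the next attempt re-reads them from the underlying database.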

Add Scan to Database

If def scan(prefix: Optional[String]): Iterator[String] is added to Database, then you can do efficient filtration queries on keys. For example, SQL-like queries of the form x LIKE 'abc%q' would translate to scan("abc").filter(_.matches("abc.*q")). You would also be able to iterate over the key-space using scan(), which could be useful for implementing backfill/migration services on top of a Database.
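A sketch of scan over an in-memory, sorted key space. The Database class here is a hypothetical stand-in, and it uses Scala's Option rather than the Optional written in the proposed signature. Because keys are sorted, every key sharing a prefix is contiguous, so the scan can stop at the first non-matching key.

```scala
import scala.collection.immutable.TreeMap

object ScanSketch {
  // Hypothetical in-memory database whose keys are kept in sorted order.
  class Database(records: TreeMap[String, String]) {
    // Iterates over every key, or only over the keys that start with the prefix.
    def scan(prefix: Option[String] = None): Iterator[String] = prefix match {
      case Some(p) => records.keysIteratorFrom(p).takeWhile(_.startsWith(p))
      case None => records.keysIterator
    }
  }
}
```

The LIKE 'abc%q' query then becomes db.scan(Some("abc")).filter(_.matches("abc.*q")).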

Generic Conditional Put Implementation

Rather than rely on databases to provide a conditional put implementation, which can be difficult to implement in distributed databases, Caustic will provide a generic implementation.
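One possible shape for a generic conditional put, sketched over an in-memory map. The Revision and Database names, the version-per-key scheme, and the use of a single lock are all assumptions for illustration; the issue does not specify the design.

```scala
object CputSketch {
  case class Revision(version: Long, value: String)

  class Database {
    private var store = Map.empty[String, Revision]

    def get(keys: Set[String]): Map[String, Revision] = synchronized {
      store.filter { case (key, _) => keys.contains(key) }
    }

    // Applies the changes only if every dependency is still at the expected version.
    // Keys that were never written are treated as version 0.
    def cput(depends: Map[String, Long], changes: Map[String, String]): Boolean = synchronized {
      val unchanged = depends.forall { case (key, version) =>
        store.get(key).map(_.version).getOrElse(0L) == version
      }
      if (unchanged) {
        changes.foreach { case (key, value) =>
          val next = store.get(key).map(_.version).getOrElse(0L) + 1
          store = store.updated(key, Revision(next, value))
        }
      }
      unchanged
    }
  }
}
```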

Adaptive Placement

There are two kinds of conditional put operations, or transactions, that may be performed on a distributed database: distributed and local. Distributed transactions span multiple shards and require an expensive coordination operation to guarantee that transactions atomically commit or abort. Local transactions occur only on a single shard and do not require any coordination. Local transactions are clearly more efficient than distributed transactions. Therefore, the library should perform adaptive placement. Whenever a distributed transaction is performed, the library may decide to colocate keys so that future transactions will be local.
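A toy sketch of the colocation decision. The Placement class, hash-based default placement, and the "move to the shard that already holds the most keys" policy are all assumptions; the issue leaves the actual policy open.

```scala
object PlacementSketch {
  class Placement(shards: Int) {
    // Explicit overrides of the default, hash-based placement.
    private var location = Map.empty[String, Int]

    def shardOf(key: String): Int =
      location.getOrElse(key, Math.floorMod(key.hashCode, shards))

    // A transaction is local when all of its keys live on one shard.
    def isLocal(keys: Set[String]): Boolean = keys.map(shardOf).size <= 1

    // After a distributed transaction, move its keys to the shard that
    // already holds most of them so that future transactions are local.
    def colocate(keys: Set[String]): Unit = if (!isLocal(keys)) {
      val target = keys.toList.map(shardOf).groupBy(identity).maxBy(_._2.size)._1
      keys.foreach(key => location = location.updated(key, target))
    }
  }
}
```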

Links

http://people.csail.mit.edu/idish/ftp/JCSS.pdf
http://rystsov.info/2012/09/01/cas.html
https://arxiv.org/pdf/1509.07815.pdf
https://www.cockroachlabs.com/blog/how-cockroachdb-distributes-atomic-transactions/

Automatic Retries

Use the Backoff implementation in caustic-common to retry transaction conflicts. Implementation should not retry failures due to invalid transactions, etc. Furthermore, the backoff durations should be client specified, but server implemented. In other words, clients tell the server to execute a transaction with x backoff durations and the server actually performs the retried execution. If both things happened on the server, then the backoff durations would be fixed after the server starts. If both things happened on the client, then each retry would require an additional network round-trip and you would have to pay the cost of transaction serialization and parsing on each attempt.
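A sketch of the server-side loop under these rules. ConflictException, InvalidException, and execute are hypothetical names, and the Backoff implementation in caustic-common is not used here because its API is not shown; the backoffs list stands for the durations sent by the client with the request.

```scala
import scala.concurrent.duration.FiniteDuration
import scala.util.{Failure, Try}

object AutoRetrySketch {
  // Hypothetical failure types: only conflicts are worth retrying.
  class ConflictException extends Exception("transaction conflict")
  class InvalidException extends Exception("invalid transaction")

  // Runs on the server; `backoffs` is supplied by the client per request.
  @scala.annotation.tailrec
  def execute[T](backoffs: List[FiniteDuration])(txn: () => T): Try[T] =
    Try(txn()) match {
      case Failure(_: ConflictException) if backoffs.nonEmpty =>
        Thread.sleep(backoffs.head.toMillis)
        execute(backoffs.tail)(txn)
      case result => result
    }
}
```

A conflict consumes one backoff per retry, while any other failure is returned immediately.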

Richer Literal Types

Right now all literals are of type String. This makes it very difficult for the transaction execution engine to provide (1) meaningful return values, (2) descriptive error messages, and (3) important functionality (appending to lists, etc.). It also makes the execution engine do unintuitive things (previously, True + True == 2 because True == 1) and complicates the evaluation of a transaction (previously, the prefetch operator took two steps to reduce because it was first translated to a list of read operators).

I propose the following base types (more complicated types can be constructed from them):

  • Boolean -> bool
  • Number -> double
  • Text -> string
  • Sequence -> array

Caustic Pants Plugin

Integrate the Caustic compiler with Pants to make it easier to compile and run Caustic programs.
