ashwin153 / caustic
A transactional programming language.
Home Page: https://madavan.me/projects/caustic.html
License: Apache License 2.0
In order to do #6, a rich cross-language interface needs to be put in front of the transaction execution engine.
Syntax package documentation is very inadequate.
Write a bash script to simplify the release of build artifacts to Maven Central.
Hook up the project to Travis CI to make sure we aren't breaking any tests on push, and to automate deployment.
Chaining multiple commands together in a cons is extremely verbose.
def block(a: Transaction, b: Transaction, rest: Transaction*): Transaction =
rest.foldLeft(cons(a, b))((x, y) => cons(x, y))
This lets you write block(a, b, c, d) instead of cons(cons(a, b), cons(c, d)).
Add a Map type to the Caustic language. This Map type would be an implementation of a B+ tree that allows efficient bulk lookups of record attributes. For example, you could find all records where a certain field is less than or greater than a particular value.
Other collection types like List and Set can be implemented using a Map. For example, a List is a Map with integer keys, and a Set is a Map from keys to themselves.
In order to deal with large collections, limit the maximum number of keys prefetched at a time.
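As a sketch of the encodings above (not the actual Caustic implementation; CMap, CList, and CSet are hypothetical stand-ins), a List and a Set can be built from nothing but a Map:

```scala
import scala.collection.mutable

// Hypothetical stand-in for the proposed Caustic Map type.
class CMap[K, V] {
  private val underlying = mutable.Map.empty[K, V]
  def put(k: K, v: V): Unit = underlying(k) = v
  def get(k: K): Option[V] = underlying.get(k)
}

// A List is a Map with integer keys.
class CList[V] {
  private val map = new CMap[Int, V]
  private var size = 0
  def append(v: V): Unit = { map.put(size, v); size += 1 }
  def apply(i: Int): Option[V] = map.get(i)
}

// A Set is a Map from keys to themselves.
class CSet[V] {
  private val map = new CMap[V, V]
  def add(v: V): Unit = map.put(v, v)
  def contains(v: V): Boolean = map.get(v).isDefined
}

val list = new CList[String]
list.append("a")
list.append("b")

val set = new CSet[Int]
set.add(3)
```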
This build failure is due to unit tests being run before the Postgres table is created. Add a latch to the unit tests so that they only run after setup is complete.
There's too much namespace overlap.
GitHub wiki pages are extremely incomplete and out-of-date.
Insert runtime transactions directly into caustic programs
Add support for indexOf, toUpperCase, toLowerCase, and trim.
Object
Dynamic Typing
Static Typing
Right now it is impossible to link together Caustic files. We'll need to add some dependency resolution logic into the compiler.
Create a website where you can write out transactions and run them against a database directly in the browser. A useful way to learn web development, and a great tool for people to test out code. In practice, this would work just like the Caustic syntax unit tests.
The runtime is executable, but it requires the system to be "just so" in order to function properly. Make the executable configuration file-based instead of programmatic, and get it to run within a docker container. This will make it really easy to spin up server instances.
Add support for Memcached, Redis, and Elasticache.
The caustic-syntax package is desperately in need of a None type. There is currently no way to distinguish between no value and the empty string.
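A minimal sketch of the distinction, using Scala's Option as the model (the store and key names are illustrative, not the caustic-syntax API):

```scala
// The store below is a stand-in for the database: "present-but-empty" is
// bound to the empty string, while "missing" has no binding at all.
val store = Map("present-but-empty" -> "")

def read(key: String): Option[String] = store.get(key)

val empty   = read("present-but-empty") // Some(""): a value that happens to be empty
val missing = read("missing")           // None: no value at all
```

With only String literals, both lookups would collapse to "" and become indistinguishable.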
Users shouldn't have to depend on all of Akka just to schedule retries, and the syntax for creating an implicit ActorSystem is clunky. It is much preferable to schedule TimerTasks on the underlying ExecutionContext.
In order to generate Thrift Scala code for #24, we must first add support for Scrooge.
./pants.ini
Scrooge requires a special pants plugin.
plugins: [
'pantsbuild.pants.contrib.scrooge==%(pants_version)s',
]
./BUILD.tools
Scrooge requires a special generator and linter library.
jar_library(name = 'scrooge-gen',
jars = [
jar(org='com.twitter', name='scrooge-generator_2.12', rev='4.18.0', excludes=[
exclude(org='org.apache.thrift', name='libthrift')
])
],
dependencies=[
'3rdparty:thrift-0.6.1',
]
)
jar_library(name = 'scrooge-linter',
jars = [
jar(org='com.twitter', name='scrooge-linter_2.12', rev='4.18.0', excludes=[
exclude(org='org.apache.thrift', name='libthrift')
])
],
dependencies=[
'3rdparty:thrift-0.6.1',
]
)
./3rdparty/jvm/BUILD
Scrooge requires version 0.6.1 of Thrift.
jar_library(name='thrift-0.6.1', jars = [
jar(org='org.apache.thrift', name='libthrift', rev='0.6.1')
])
./caustic-runtime/src/main/thrift/BUILD
java_thrift_library(
name='scala',
compiler='scrooge',
language='scala',
sources=rglobs('*.thrift'),
)
Memoize Goal execution to avoid re-running it on the same source file.
Write a program that converts Schema transactions into runnable ANSI SQL. Obviously, certain Schema transactions are extremely difficult or impossible to express in ANSI SQL. This is a useful way to compare Schema and SQL both for marketing and for debugging.
class Transaction {
  def toSQL: String = this match {
    case Read(k) =>
      s"""SELECT value WHERE key = "${k.toSQL}""""
    case Write(k, v) =>
      s"""INSERT INTO table (key, value) VALUES (${k.toSQL}, "${v.toSQL}") ON DUPLICATE KEY UPDATE key = "${k.toSQL}" AND value = "${v.toSQL}""""
    case Literal(x) =>
      x.toString
    case Cons(x, y) =>
      s"""${x.toSQL}; ${y.toSQL}"""
    case Add(x, y) =>
      s"""${x.toSQL} + ${y.toSQL}"""
    ...
    case default =>
      throw new UnsupportedOperationException(s"$default is inexpressible in SQL")
  }
}
Some operations like repeat, prefetch, load, store, and branch are inexpressible in SQL. For these operations, an exception is thrown to indicate that the transaction cannot be translated.
Unused dependencies and adopting arguments.
Add support for Python and possibly C++, Rust, and/or JavaScript.
Before Schema can be safely used in production, we need to test its performance under various workloads (high read, high write, high contention, etc.). Performance is generally the main concern people have with this project, so benchmarks should be a top priority to convince people of its viability.
Database size.
Implement benchmarks for YCSB.
The ./pants.ini file is getting way too complex. It will be much easier to find issues if the different sections are alphabetized. Furthermore, we should adhere to Pants conventions, like naming the file publish.ivysettings.xml instead of ivy-publish.xml.
Make it easier to bootstrap a Database server, and simplify the process of connecting clients and executing transactions. Implement service discovery to allow servers to automatically register themselves and clients to automatically discover them. Service discovery can be easily implemented on top of ZooKeeper using standard Curator recipes.
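The register/discover flow might look like the following sketch. The in-memory Registry is a hypothetical stand-in for the ZooKeeper-backed store that a Curator service-discovery recipe would provide:

```scala
import scala.collection.mutable

case class ServerAddress(host: String, port: Int)

// In-memory stand-in for a ZooKeeper-backed registry.
class Registry {
  private val servers = mutable.Set.empty[ServerAddress]
  def register(s: ServerAddress): Unit = servers += s        // server announces itself
  def unregister(s: ServerAddress): Unit = servers -= s      // e.g. when its session expires
  def discover(): Option[ServerAddress] = servers.headOption // client picks any live server
}

val registry = new Registry
registry.register(ServerAddress("10.0.0.1", 9090))
```

With ZooKeeper, register would create an ephemeral node, so a crashed server disappears from the registry automatically.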
Schema is really difficult to search for; the name is too prevalent in the database world.
Suppose an object is referenced from an index. When the object is deleted, it should be removed from the index. This will make it so that you can store collections of objects and iterate over them. We'll associate with each object a list of indices that it is a member of.
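A sketch of the bookkeeping described above, assuming objects and indices are keyed by strings (all names here are illustrative):

```scala
import scala.collection.mutable

// index name -> keys of member objects
val indices = mutable.Map.empty[String, mutable.Set[String]]
// object key -> the indices it is a member of
val memberOf = mutable.Map.empty[String, List[String]]

def insert(key: String, into: List[String]): Unit = {
  memberOf(key) = into
  into.foreach(i => indices.getOrElseUpdate(i, mutable.Set.empty) += key)
}

// Deleting an object also removes it from every index it belongs to.
def delete(key: String): Unit =
  memberOf.remove(key).getOrElse(Nil).foreach(i => indices(i) -= key)

insert("user#1", List("by-name", "by-email"))
delete("user#1")
```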
A full-fledged programming language that compiles into Thrift/HTTP/Protobuf interfaces and implementations, with configurable storage engines, and that serves them when run.
record NameOfRecord {
bool x,
double y,
string z,
NameOfOtherRecord bar,
NameOfRecord& car,
}
service NameOfService {
def getAllUsers(foo: NameOfRecord): NameOfRecord {
if (foo.x) {
val i = 0
while (i < foo.y) {
foo.bar.z = "3"
foo.car.z = "3"
i++
}
}
return foo;
}
}
Right now the TransactionalDatabase creates a new snapshot each time it executes a transaction. Instead, this snapshot should be an LRU cache that is invalidated on unsuccessful writes.
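One way to sketch such an LRU cache on the JVM is java.util.LinkedHashMap in access-order mode (this is an illustration, not the actual TransactionalDatabase code):

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// LRU cache built on LinkedHashMap's access-order mode: the least recently
// used entry is evicted once capacity is exceeded.
class LruCache[K, V](capacity: Int) {
  private val underlying = new JLinkedHashMap[K, V](16, 0.75f, true) {
    override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
      size() > capacity
  }
  def get(k: K): Option[V] = Option(underlying.get(k))
  def put(k: K, v: V): Unit = underlying.put(k, v)
  def invalidate(k: K): Unit = underlying.remove(k) // called on an unsuccessful write
}

val cache = new LruCache[String, String](2)
cache.put("x", "1")
cache.put("y", "2")
cache.get("x")      // touch "x", so "y" is now least recently used
cache.put("z", "3") // exceeds capacity and evicts "y"
```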
If def scan(prefix: Optional[String]): Iterator[String] is added to Database, then you can do efficient filtration queries on keys. For example, SQL-like queries of the form x LIKE 'abc%q' would translate to scan(r'abc').filter(_.matches('abc.*q')). You would also be able to iterate over the key space using scan(), which could be useful for implementing backfill/migration services on top of a Database.
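A sketch of how scan could work over a sorted key space, using an in-memory SortedMap as a stand-in for a Database:

```scala
import scala.collection.immutable.SortedMap

// Sorted key space standing in for a Database; keys are stored in order.
val db = SortedMap("abcq" -> "1", "abcxq" -> "2", "abd" -> "3", "zzz" -> "4")

// scan() without a prefix iterates over the entire key space.
def scan(prefix: String = ""): Iterator[String] =
  db.iteratorFrom(prefix).map(_._1).takeWhile(_.startsWith(prefix))

// The SQL query x LIKE 'abc%q' translates to:
val matches = scan("abc").filter(_.matches("abc.*q")).toList
```

Because the keys are sorted, the prefix scan only touches the matching range instead of the whole key space.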
Error messages are abysmal. Use ANTLR Error Nodes to propagate errors.
Rather than rely on databases to provide a conditional put implementation, which can be difficult to implement in distributed databases, Caustic will provide a generic implementation.
There are two kinds of conditional put operations, or transactions, that may be performed on a distributed database: distributed and local. Distributed transactions span multiple shards and require an expensive coordination operation to guarantee that transactions atomically commit or abort. Local transactions occur only on a single shard and do not require any coordination. Local transactions are clearly more efficient than distributed transactions. Therefore, the library should perform adaptive placement. Whenever a distributed transaction is performed, the library may decide to colocate keys so that future transactions will be local.
http://people.csail.mit.edu/idish/ftp/JCSS.pdf
http://rystsov.info/2012/09/01/cas.html
https://arxiv.org/pdf/1509.07815.pdf
https://www.cockroachlabs.com/blog/how-cockroachdb-distributes-atomic-transactions/
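A sketch of the conditional put primitive over a single in-memory shard, using versioned values (the names are illustrative, not the actual Caustic API):

```scala
import scala.collection.mutable

// key -> (version, value); versions make the put conditional.
val store = mutable.Map.empty[String, (Long, String)]

// Write `value` only if the caller's expected version matches the current one.
def conditionalPut(key: String, expected: Long, value: String): Boolean =
  store.synchronized {
    val current = store.get(key).map(_._1).getOrElse(0L)
    if (current == expected) { store(key) = (current + 1, value); true }
    else false
  }

val first = conditionalPut("k", expected = 0L, value = "a") // succeeds, version 0 -> 1
val stale = conditionalPut("k", expected = 0L, value = "b") // fails, version is now 1
```

A local transaction needs only this single-shard check; a distributed transaction must coordinate the same check across every shard it touches.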
Add support for Cassandra, RocksDB, CockroachDB, DynamoDB, and SQL Server.
Use the Backoff implementation in caustic-common to retry transaction conflicts. The implementation should not retry failures due to invalid transactions, etc. Furthermore, the backoff durations should be client-specified but server-implemented: clients tell the server to execute a transaction with x backoff durations, and the server actually performs the retried execution. If both happened on the server, the backoff durations would be fixed once the server starts. If both happened on the client, each retry would require an additional network round-trip, and you would pay the cost of transaction serialization and parsing on each attempt.
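A sketch of the proposed split, with a client-supplied backoff schedule executed on the server side (ConflictException and the retry loop are illustrative, not the caustic-common API):

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

case class ConflictException(message: String) extends Exception(message)

// The server executes the transaction, sleeping between attempts according to
// the client-supplied backoff schedule. Non-conflict failures are not retried.
@tailrec
def execute[T](attempt: () => T, backoffs: List[FiniteDuration]): Try[T] =
  Try(attempt()) match {
    case Success(v) => Success(v)
    case Failure(_: ConflictException) if backoffs.nonEmpty =>
      Thread.sleep(backoffs.head.toMillis)
      execute(attempt, backoffs.tail)
    case Failure(e) => Failure(e)
  }

var attempts = 0
val result = execute(() => {
  attempts += 1
  if (attempts < 3) throw ConflictException("write conflict") else "ok"
}, List(1.millis, 1.millis, 1.millis))
```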
When running the database thread-safety test on MySQL and PostgreSQL, the test sometimes passes and sometimes fails.
Because the database execution logic is implemented using head recursion, it causes a StackOverflowError for large transactions.
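The fix is to make the recursion tail-recursive (or trampolined) so the compiler can turn it into a loop. A sketch with a stand-in computation, where the list plays the role of a large flattened transaction:

```scala
import scala.annotation.tailrec

// Head-recursive: one stack frame per element; overflows on large input.
def sumHead(xs: List[Int]): Long =
  if (xs.isEmpty) 0L else xs.head + sumHead(xs.tail)

// Tail-recursive equivalent: the compiler rewrites it into a loop,
// so stack usage is constant regardless of input size.
@tailrec
def sumTail(xs: List[Int], acc: Long = 0L): Long =
  if (xs.isEmpty) acc else sumTail(xs.tail, acc + xs.head)

// A million elements: fine for sumTail, would overflow with sumHead.
val total = sumTail(List.fill(1000000)(1))
```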
Right now all literals are of type String. This makes it very difficult for the transaction execution engine to provide (1) meaningful return values, (2) descriptive error messages, and (3) important functionality (appending to lists, etc.). It also makes the execution engine do unintuitive things (previously, True + True == 2 because True == 1) and complicates the evaluation of a transaction (previously, the prefetch operator took two steps to reduce, because it was first translated to a list of read operators).
I propose the following base types (more complicated types can be constructed from them):
Boolean -> bool
Number -> double
Text -> string
Sequence -> array
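A sketch of the proposed base types as an algebraic data type, showing how typed addition avoids the True + True == 2 behavior (the names are illustrative):

```scala
// Typed literals as an algebraic data type.
sealed trait Literal
case class Bool(value: Boolean) extends Literal
case class Num(value: Double) extends Literal
case class Text(value: String) extends Literal

// Addition is now defined per type, so booleans no longer coerce to numbers.
def add(x: Literal, y: Literal): Literal = (x, y) match {
  case (Num(a), Num(b))   => Num(a + b)   // numeric addition
  case (Text(a), Text(b)) => Text(a + b)  // concatenation
  case (Bool(a), Bool(b)) => Bool(a || b) // logical or, not True + True == 2
  case _ => throw new IllegalArgumentException(s"Cannot add $x and $y")
}
```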
Integrate the Caustic compiler with Pants to make it easier to compile and run Caustic programs.