jakekgrog / ghostdb

GhostDB is a distributed, in-memory, general purpose key-value data store that delivers microsecond performance at any scale.

Home Page: http://www.ghostdbcache.com

License: BSD 3-Clause "New" or "Revised" License

Go 95.19% Makefile 1.86% Shell 2.83% PowerShell 0.11%

cache golang database datastore distributed-database in-memory-database

ghostdb's Introduction

Update - 03/03/2021

Where we've been

GhostDB stemmed from a university project. Due to the nature of these projects (time constraints etc.), we feel some corners were cut. For example, we opted for the memcached model of distribution because it was easier and quicker to implement. However, this wasn't the original vision for GhostDB. Connor and I also started new jobs, which took up a good chunk of our time. Combined with having just finished a really busy final year in university, this led us to mothball the project for a while. We're finally returning to it and hope to transform it into what we had originally planned.

A new roadmap

We are revising our roadmap below and plan to release an updated version soon. Before we do, here is a brief rundown of what we want:

Transition away from the memcached model to a consistent, partition-tolerant system (with limited fault tolerance too) by implementing the Raft consensus protocol. (This is almost complete.)
Release a CLI to allow users to easily manage their clusters
Re-build our SDKs from the ground up to allow users to interact with GhostDB with more ease than is currently possible.
Implement new data types to broaden GhostDB's use cases.
Local caching to give an even greater performance boost to users.
Release AWS Amazon Machine Images (AMIs) and Google Compute Engine Images to allow users to easily create GhostDB clusters in the cloud with only a few clicks.
Updates to the website that include a download centre and documentation improvements.

Contributing

Unfortunately, with work and life we simply don't have the time at the moment to manage pull requests from anyone else. However, we are still accepting issues and are encouraging them.

And of course, we also want to continue improving on our performance :)

📚 Overview

GhostDB is a distributed, in-memory, general purpose key-value data store that delivers microsecond performance at any scale.

GhostDB is designed to speed up dynamic database- or API-driven websites by storing data in RAM, reducing the number of times an external data source such as a database or API must be read. GhostDB provides a very large hash table that is distributed across multiple machines and stores large numbers of key-value pairs.
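To make the idea concrete, here is a minimal sketch of a hash table spread over several nodes and used as a read-through cache. It does not use the GhostDB SDK: `Cluster`, `NodeFor` and `GetOrLoad` are hypothetical names, each "node" is a local map, and the FNV-plus-modulo key placement is an assumption in the spirit of the memcached model, not a description of GhostDB's actual hashing scheme. The sketch is single-goroutine only.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Cluster stands in for a set of GhostDB nodes. Each "node" here is just a
// local map; a real client would talk to remote servers instead.
type Cluster struct {
	nodes []map[string]string
}

// NewCluster creates a cluster with n empty nodes (n must be at least 1).
func NewCluster(n int) *Cluster {
	c := &Cluster{nodes: make([]map[string]string, n)}
	for i := range c.nodes {
		c.nodes[i] = make(map[string]string)
	}
	return c
}

// NodeFor hashes the key with FNV-1a and maps it onto one node index.
// Modulo placement is an assumption made for this sketch.
func (c *Cluster) NodeFor(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(len(c.nodes)))
}

// Get looks the key up on the node that owns it.
func (c *Cluster) Get(key string) (string, bool) {
	v, ok := c.nodes[c.NodeFor(key)][key]
	return v, ok
}

// Put stores the pair on the node that owns the key.
func (c *Cluster) Put(key, value string) {
	c.nodes[c.NodeFor(key)][key] = value
}

// GetOrLoad returns the cached value for key and calls load (the slow
// external source, such as a database query) only on a cache miss.
func GetOrLoad(c *Cluster, key string, load func(string) (string, error)) (string, error) {
	if v, ok := c.Get(key); ok {
		return v, nil
	}
	v, err := load(key)
	if err != nil {
		return "", err
	}
	c.Put(key, v)
	return v, nil
}

func main() {
	cluster := NewCluster(3)
	dbReads := 0
	load := func(key string) (string, error) {
		dbReads++ // counts trips to the pretend database
		return "profile-of-" + key, nil
	}
	for i := 0; i < 3; i++ {
		v, err := GetOrLoad(cluster, "user:42", load)
		if err != nil {
			panic(err)
		}
		fmt.Println(v)
	}
	fmt.Println("database reads:", dbReads)
}
```

Only the first lookup reaches the loader; the next two are served from memory, which is the saving the overview describes.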

🚗 Roadmap

GhostDB was a university project - it is not fully featured but we're getting there!

This is a high-level roadmap of what we want GhostDB to become by the end of 2020. If you have any feature requests please create one from the template and label it as feature request!

First-class support for list, set, stack and queue data structures
Atomic command queues
Subscribable streams
Monitoring & administration dashboard
Enhanced security features
Transition to TCP sockets as transport protocol
CLI
Support for a wide range of programming languages

🔧 Installation

To install GhostDB please consult the installation guide for a quick walkthrough on setting up the system.

🔨 Cluster Configuration

To configure a GhostDB cluster please follow the instructions in the configuration guide.

✏️ Authors

Jake Grogan

GitHub: @jakekgrog

Connor Mulready

GitHub: @nohclu

⭐ Show your support

Give a ⭐ if this project helped you!

ghostdb's Issues

TiDB CDC example integration

GhostDB would be a great match for databases that support change data capture (CDC).

You could use GhostDB to hold your data in a CDN such as Cloudflare, as a materialised view that is fed from TiDB.

Here is an example that feeds data from TiDB to Kafka, to give an idea:
https://github.com/pingcap/ticdc/tree/master/kafka_consumer.

GhostDB would then act as the read-only DB and the TiDB cluster as the write DB.

This would be amazingly efficient I think.
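A rough sketch of the proposal, assuming a heavily simplified change feed: each event is either an upsert or a delete of one key, and a plain map stands in for GhostDB. The `ChangeEvent` shape and the `View` type are hypothetical; they are not TiCDC's message format or GhostDB's API.

```go
package main

import "fmt"

// Op is the kind of change captured from the write database.
type Op int

const (
	Upsert Op = iota
	Delete
)

// ChangeEvent is a hypothetical, simplified CDC record. Real feeds carry
// far more detail (table, commit timestamp, old and new row images).
type ChangeEvent struct {
	Op    Op
	Key   string
	Value string
}

// View is the read-only copy of the data; a map stands in for GhostDB.
type View map[string]string

// Apply folds one change event into the view.
func (v View) Apply(e ChangeEvent) {
	switch e.Op {
	case Upsert:
		v[e.Key] = e.Value
	case Delete:
		delete(v, e.Key)
	}
}

func main() {
	// Events as they might arrive, in commit order, from the change feed.
	events := []ChangeEvent{
		{Upsert, "user:1", "alice"},
		{Upsert, "user:2", "bob"},
		{Upsert, "user:1", "alice-updated"},
		{Delete, "user:2", ""},
	}
	view := View{}
	for _, e := range events {
		view.Apply(e)
	}
	fmt.Println(len(view), view["user:1"])
}
```

Replaying the events in commit order leaves the view holding only the latest state of each surviving key, so reads never have to touch the write database.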

How does GhostDB compare to ...

There are many KV stores, and many caches, and many in-memory things.

How does GhostDB compare? Can you add comparisons to memcached, Redis, annakv, etc. to the website? That seems like it would be an FAQ.

Installer rework

Creating a user via the installer doesn't create the user dir. Requires manual creation in order for the service to run.

Benchmark result?

Compared to similar products: Redis, Aerospike, Tarantool, etc.

Add issue and PR templates in /docs

Request history

We would like to be able to create a log of recent request history for the node (the last hour, for example). These logs would record which commands were executed, which client each request came from, whether the request failed or succeeded, and so on. It is open for discussion here, so please contribute if you have any good ideas!
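As one possible starting point for the discussion, the sketch below keeps a time-windowed list of request entries and prunes old ones on every write. The `Entry` fields and the `History` type are hypothetical, entries are assumed to arrive in time order, and there is no locking, so a real node would need a mutex around it.

```go
package main

import (
	"fmt"
	"time"
)

// Entry records one handled request. Field names are illustrative only.
type Entry struct {
	At      time.Time
	Client  string
	Command string
	OK      bool
}

// History keeps entries no older than Window, in arrival order.
type History struct {
	Window  time.Duration
	entries []Entry
}

// Record appends an entry, then drops entries older than the window,
// measured from the new entry's timestamp.
func (h *History) Record(e Entry) {
	h.entries = append(h.entries, e)
	cutoff := e.At.Add(-h.Window)
	i := 0
	for i < len(h.entries) && h.entries[i].At.Before(cutoff) {
		i++
	}
	h.entries = h.entries[i:]
}

// Recent returns a copy of the retained entries.
func (h *History) Recent() []Entry {
	return append([]Entry(nil), h.entries...)
}

func main() {
	h := &History{Window: time.Hour}
	now := time.Now()
	h.Record(Entry{now.Add(-2 * time.Hour), "10.0.0.5", "put", true})
	h.Record(Entry{now.Add(-30 * time.Minute), "10.0.0.5", "get", true})
	h.Record(Entry{now, "10.0.0.9", "delete", false})
	for _, e := range h.Recent() {
		fmt.Println(e.Client, e.Command, e.OK)
	}
}
```

In the example the two-hour-old `put` falls outside the one-hour window and is pruned, leaving the `get` and the failed `delete`.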

Add get number of client connections

For the GhostDB CLI we need a way of counting the number of active connections to each node in a cluster.

Add support for Queue data type

Adding support for queues will allow GhostDB to function as a message queue.
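A minimal sketch of what a queue value addressed by key might look like, with a producer pushing and a consumer popping in FIFO order. `Store`, `Push` and `Pop` are hypothetical names chosen for illustration, not GhostDB commands, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmpty is returned when popping from an empty or missing queue.
var ErrEmpty = errors.New("queue is empty")

// Queue is a FIFO of string values.
type Queue struct {
	items []string
}

// Enqueue adds a value at the back.
func (q *Queue) Enqueue(v string) {
	q.items = append(q.items, v)
}

// Dequeue removes and returns the value at the front.
func (q *Queue) Dequeue() (string, error) {
	if len(q.items) == 0 {
		return "", ErrEmpty
	}
	v := q.items[0]
	q.items[0] = "" // drop the reference so the string can be collected
	q.items = q.items[1:]
	return v, nil
}

// Store maps keys to queues, so a queue is one more value type in the keyspace.
type Store struct {
	queues map[string]*Queue
}

func NewStore() *Store {
	return &Store{queues: make(map[string]*Queue)}
}

// Push appends v to the queue stored under key, creating it if needed.
func (s *Store) Push(key, v string) {
	q, ok := s.queues[key]
	if !ok {
		q = &Queue{}
		s.queues[key] = q
	}
	q.Enqueue(v)
}

// Pop removes the oldest value from the queue stored under key.
func (s *Store) Pop(key string) (string, error) {
	q, ok := s.queues[key]
	if !ok {
		return "", ErrEmpty
	}
	return q.Dequeue()
}

func main() {
	s := NewStore()
	s.Push("jobs", "job-1") // producer side
	s.Push("jobs", "job-2")
	for {
		job, err := s.Pop("jobs") // consumer side
		if err != nil {
			break
		}
		fmt.Println(job)
	}
}
```

A real message queue would also need blocking pops or notifications so consumers do not have to poll, which this sketch leaves out.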

Keyspace analysis

We would like a way for our users to perform analysis of a node's keyspace. This would require keeping track of frequently used keys (keys with a high number of hits; we need to define what constitutes a high number of hits, or perhaps just return the top x keys), the longest keys, the shortest keys, etc.

We would also like to be able to analyse the largest items in the cache. For example, a user might want to know the largest queue on the node, the longest string, largest set etc.

Again, this is up for discussion and I'm interested in hearing your thoughts!
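To ground the discussion, here is a small sketch of the "top x keys" option: count hits per key, then sort and truncate on demand. `Stats`, `Hit`, `TopN` and `LongestKey` are hypothetical names, and sorting the whole keyspace on every query is a simplification that a large node would likely replace with a heap or a sampled counter.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyStat pairs a key with its hit count.
type KeyStat struct {
	Key  string
	Hits int
}

// Stats counts hits per key.
type Stats struct {
	hits map[string]int
}

func NewStats() *Stats {
	return &Stats{hits: make(map[string]int)}
}

// Hit records one access to key.
func (s *Stats) Hit(key string) {
	s.hits[key]++
}

// TopN returns up to n keys ordered by hit count, highest first.
// Ties are broken by key so the result is deterministic.
func (s *Stats) TopN(n int) []KeyStat {
	out := make([]KeyStat, 0, len(s.hits))
	for k, h := range s.hits {
		out = append(out, KeyStat{Key: k, Hits: h})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Hits != out[j].Hits {
			return out[i].Hits > out[j].Hits
		}
		return out[i].Key < out[j].Key
	})
	if n < 0 {
		n = 0
	}
	if n < len(out) {
		out = out[:n]
	}
	return out
}

// LongestKey returns the longest tracked key in bytes, or "" if none.
// Among keys of equal length the lexically smallest is returned.
func (s *Stats) LongestKey() string {
	longest := ""
	for k := range s.hits {
		if len(k) > len(longest) || (len(k) == len(longest) && k < longest) {
			longest = k
		}
	}
	return longest
}

func main() {
	s := NewStats()
	for _, k := range []string{"a", "session:9", "a", "cart", "a", "cart"} {
		s.Hit(k)
	}
	for _, ks := range s.TopN(2) {
		fmt.Println(ks.Key, ks.Hits)
	}
	fmt.Println("longest key:", s.LongestKey())
}
```

The same shape extends to the largest-item queries: track a size per key alongside the hit count and sort on that field instead.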

keyspaceSize definition

Found this project via Hacker News. Looks awesome. Quick question:

"keyspaceSize": 65536

Is that the number of items or the number of bytes that can be stored? Typically you allocate X bytes of memory that can be used. Allocating a number of items seems strange, as items will be variable length, right?

config_reader.go
config_reader_test.go
INSTALLATION.md

Please add a license to this repo

First, thank you for sharing this project with us!

Could you please add an explicit LICENSE file to the repo so that it's clear
under what terms the content is provided, and under what terms user
contributions are licensed?

Per GitHub docs on licensing:

[...] without a license, the default copyright laws apply, meaning that you
retain all rights to your source code and no one may reproduce, distribute,
or create derivative works from your work. If you're creating an open source
project, we strongly encourage you to include an open source license.

Thanks!