startupDB

startupDB is a database designed to create REST APIs. It is implemented as an Express middleware function and allows for easy implementation of persistent data endpoints. It protects against data loss during hardware failure by persisting individual operations in JSON files, and it offers high performance by serving all data from memory.

Its CRUD operations map directly to the POST, GET, PUT/PATCH and DELETE methods.

Usage

Example

const express = require('express')
const startupDB = require('startupDB')

const app = express()
app.use("/myDB", startupDB.db)
const server = app.listen(3000)

This will create a database under the myDB directory. Every endpoint that starts with /myDB will translate to a collection with the same name. So localhost:3000/myDB/user will implement POST, GET, PUT, DELETE and PATCH endpoints to create, find, update, delete and change user documents. Data will be persisted to disk in checkpoint/user and oplog/user directories.

Methods

StartupDB implements the following methods on all endpoints:

GET

The GET method retrieves data from the database. Retrieving data from a non-existing collection results in a 200 OK with an empty response; it does not return a 400 error.

no parameters

GET localhost:3000/myDB/user will return all documents in the collection.

id parameter

GET localhost:3000/myDB/user?id=peter will return the document with id == 'peter'.

filter parameter

GET localhost:3000/myDB/user?filter=lastname=="Smith" will return all documents with lastname == 'Smith'.

The filter parameter supports sandboxed JavaScript expressions as implemented by filtrex.

offset and limit parameters

GET localhost:3000/myDB/user?offset=10&limit=10 will return documents 11 - 20.
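
Filter expressions contain characters such as = and " that need URL-encoding when a request is built in code. The following is a minimal sketch using the standard URL class. buildCollectionUrl is a hypothetical helper and not part of startupDB; the host and collection names are taken from the examples above.

```javascript
// Hypothetical helper (not part of startupDB): builds a query URL for a collection.
function buildCollectionUrl(baseUrl, collection, params = {}) {
  const url = new URL(`${baseUrl}/${collection}`)
  for (const [name, value] of Object.entries(params)) {
    // searchParams.set percent-encodes characters such as = and "
    url.searchParams.set(name, String(value))
  }
  return url.toString()
}

const byFilter = buildCollectionUrl('http://localhost:3000/myDB', 'user', { filter: 'lastname=="Smith"' })
const secondPage = buildCollectionUrl('http://localhost:3000/myDB', 'user', { offset: 10, limit: 10 })

console.log(byFilter)
console.log(secondPage)
```

Building the query through searchParams rather than string concatenation keeps the expression intact regardless of which characters it contains.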

returnType parameter

The returnType parameter can be object, checkPoint or array (default).

GET localhost:3000/myDB/user?returnType=object will return all documents as an object using the id field as a key.

GET localhost:3000/myDB/user?returnType=checkpoint will return all documents as stored in the checkPoint including metadata. The nextOplogId in the metadata can be used for oplog polling.
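
To make the difference between the array and object return types concrete, the sketch below models the two response shapes with plain JavaScript. It only illustrates the described result and says nothing about how startupDB produces it; toObjectById and the sample documents are made up for this example.

```javascript
// Illustration only: models the two response shapes, not startupDB's internals.
const asArray = [
  { id: 'peter', lastname: 'Smith' },
  { id: 'mary', lastname: 'Jones' }
]

// Key each document by its id field, as returnType=object is described to do.
function toObjectById(documents) {
  return Object.fromEntries(documents.map((doc) => [doc.id, doc]))
}

const asObject = toObjectById(asArray)
console.log(asObject.peter.lastname)
```

The object form is convenient when a client looks documents up by id, since no search through the array is needed.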

POST

The POST method adds new documents to the database. POSTing data to a non-existing collection will create the collection. The body can contain one object or an array of objects. If the objects have no id property, one will be added to each document containing a version 4 UUID string.

If a document is POSTed with an id that already exists in the collection, a 409 conflict error will be returned. To update an existing document, use the PUT or PATCH methods.
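
The POST rules above can be modelled in a few lines. This is a sketch of the described behaviour on an in-memory Map, not startupDB's actual code. postDocuments is a hypothetical name, and two details are assumptions: that a conflicting id rejects the whole request, and that success is reported as 200.

```javascript
const { randomUUID } = require('crypto')

// In-memory model of the POST rules described above; not startupDB's actual code.
function postDocuments(collection, body) {
  // The body may be a single object or an array of objects.
  const documents = Array.isArray(body) ? body : [body]
  // Assign a version 4 UUID to every document that has no id.
  const withIds = documents.map((doc) => (doc.id === undefined ? { ...doc, id: randomUUID() } : doc))
  // Assumption: one existing id rejects the whole request with 409 Conflict.
  const duplicate = withIds.find((doc) => collection.has(doc.id))
  if (duplicate) return { statusCode: 409, message: `Document ${duplicate.id} already exists` }
  for (const doc of withIds) collection.set(doc.id, doc)
  return { statusCode: 200, data: withIds }
}

const users = new Map()
const first = postDocuments(users, { id: 'peter', lastname: 'Smith' })
const generated = postDocuments(users, { lastname: 'Jones' })
const conflict = postDocuments(users, { id: 'peter', lastname: 'Other' })

console.log(first.statusCode, generated.data[0].id, conflict.statusCode)
```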

PUT

The PUT method replaces existing documents or creates new documents in the database. PUTting data to a non-existing collection will create the collection. The body can contain one object or an array of objects. If the objects have no id property, one will be added to each document containing a version 4 UUID string. If a document exists in the collection with an id mentioned in the body of the PUT, the document will be replaced with the new document.

DELETE

The DELETE method removes documents from the database.

id parameter

DELETE localhost:3000/myDB/user?id=peter will delete the document with id == 'peter'.

filter parameter

DELETE localhost:3000/myDB/user?filter=lastname=="Smith" will delete all documents with lastname == 'Smith'.

The filter parameter supports sandboxed JavaScript expressions as implemented by filtrex.

PATCH

The PATCH method updates documents in the database. The body can contain one object or an array of objects. If the objects have no id property, one will be added to each document containing a version 4 UUID string.

jsonpatch

PATCHes can be performed by jsonpatch. This allows for lightweight, fine-grained updates on large objects. To use jsonpatch, the objects in the body should follow this schema:

{
    "id":string,
    "patch":array
}

Object.assign

If the object has any other schema, the PATCH will be performed by JavaScript's Object.assign.
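
The two PATCH body shapes can be told apart as sketched below. The functions isJsonPatchBody and mergeUpdate are hypothetical names that model the described dispatch and are not taken from startupDB. The patch array uses the standard JSON Patch operation format (op, path, value). The merge copies into a new object here, whereas a direct Object.assign(existing, update) would modify the existing document in place.

```javascript
// Model of the PATCH dispatch described above; not startupDB's actual code.
function isJsonPatchBody(item) {
  return typeof item.id === 'string' && Array.isArray(item.patch)
}

// Merge branch: only the properties present in the update are overwritten.
function mergeUpdate(existing, update) {
  return Object.assign({}, existing, update)
}

const stored = { id: 'peter', firstname: 'Peter', lastname: 'Smith' }
const jsonPatchBody = { id: 'peter', patch: [{ op: 'replace', path: '/lastname', value: 'Jones' }] }
const mergeBody = { id: 'peter', lastname: 'Jones' }
const merged = mergeUpdate(stored, mergeBody)

console.log(isJsonPatchBody(jsonPatchBody), isJsonPatchBody(mergeBody))
console.log(merged)
```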

API

const startupDB = require('startupDB')

startupDB([options])

Returns the database middleware using the given options

Options

The options object allows you to configure the following options:

| Option Name | Type | Default value | Description |
|---|---|---|---|
| dataFiles | string | './' | Path to data directory |
| validator | function | undefined | Function to validate schema |
| addTimeStamps | function | undefined | Function to add create/modified timestamps. |
| opLogArchive | string | undefined | Path to archive directory |
| serveRawCheckpoint | boolean | false | Stream checkpoint to client, does not keep resource in memory |
| streamObjects | boolean | false | Stream JSON response to client, does not block the event loop even for large payloads |

Schema validation

A schema validator can be passed using the options.validator function.

Your function should implement the following interface:

/*
 * @param {string} collection: the name of the collection
 * @param {array} documents: the documents to validate
 * @return false | array: false or an array with error messages
 */
validator(collection, documents)
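
A validator following this interface might look like the sketch below. It assumes, based on the signature above, that collection is the collection name and documents is an array of documents; the rule itself (user documents need a non-empty lastname) is purely hypothetical.

```javascript
// Hypothetical validator: every user document needs a non-empty string lastname.
// Assumes collection is the collection name and documents is an array.
function validator(collection, documents) {
  if (collection !== 'user') return false // no rules for other collections
  const errors = []
  for (const doc of documents) {
    if (typeof doc.lastname !== 'string' || doc.lastname === '') {
      errors.push(`Document ${doc.id}: lastname must be a non-empty string`)
    }
  }
  // false signals "no errors"; otherwise return the list of messages
  return errors.length > 0 ? errors : false
}

console.log(validator('user', [{ id: 'peter', lastname: 'Smith' }]))
console.log(validator('user', [{ id: 'mary' }]))
```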

Timestamps

startupDB can auto-timestamp your documents using the options.addTimeStamps function.

This function will be called when documents are created or modified. The timestamp function will be called before your documents are validated, so make sure your schema includes your timestamps when you use the optional schema validation.

Your function should implement the following interface:

/*
 * @param {string} operation: "created" or "modified"
 * @param {object} document: the document to change
 * @param {object} oldDocument: the old document (before modify)
 */
function(operation,document,oldDocument)

Example

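// Sets __created on new documents and __modified on changed ones.
function addTimeStamps(operation, object, oldObject) {
    if (operation == "created") object.__created = new Date().getTime()
    if (operation == "modified") {
        object.__modified = new Date().getTime()
        // Carry the original creation time over to the modified document
        if (oldObject) object.__created = oldObject.__created
    }
}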

Hooks

startupDB supports database hooks to run endpoint-specific code either before or after the CRUD operation. They can be used for everything from authentication to data conversion.

A 'before' hook should implement the following interface:

/*
 * @param {object} req: like in Express
 * @param {object} res: like in Express
 * @param {function} next: like in Express
 * @param {string} collection: the name of the collection
 *
 * @return:  {"statusCode":<HTTP StatusCode>,"data":<response body>,"message":<status message>}
 *
 * return {"statusCode":0} when there are no errors
 *
 */
function(req, res, next, collection){
    return {
        "statusCode":200,
        "data":{
            "name":"value"
        },
        "message":"OK"
    }
}
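
As an illustration of using a 'before' hook for authentication, the sketch below rejects requests that lack an API key. The header name x-api-key, the expected key, and the function name requireApiKey are assumptions made for this example. How hooks are registered is not shown, because it is not described here.

```javascript
// Hypothetical 'before' hook: reject requests that lack an API key header.
// The header name x-api-key and the expected key are assumptions for illustration.
const EXPECTED_KEY = 'secret'

function requireApiKey(req, res, next, collection) {
  if (req.headers['x-api-key'] !== EXPECTED_KEY) {
    return { statusCode: 401, message: `Not authorized for collection ${collection}` }
  }
  return { statusCode: 0 } // statusCode 0 signals that there are no errors
}

const denied = requireApiKey({ headers: {} }, {}, () => {}, 'user')
const allowed = requireApiKey({ headers: { 'x-api-key': 'secret' } }, {}, () => {}, 'user')
console.log(denied, allowed)
```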

An 'after' hook should implement the following interface:

/*
 * @param {object} req: like in Express
 * @param {object} response: response object from database
 *
 * @return:  {"error":<HTTP StatusCode>,"data":<response body>,"message":<status message>,"headers":<response headers>}
 *
 * Omit the error property in the response when there are no errors
 */
function(req, response){
    return {
        "data":response.data
    }
}
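
An 'after' hook is a natural place for data conversion. The sketch below removes a password property from the data before it is returned. It assumes response.data holds either a single document or an array of documents, which is not confirmed above; hidePasswords and the password property are hypothetical.

```javascript
// Hypothetical 'after' hook: strip a password property before data leaves the server.
// Assumes response.data is a single document or an array of documents.
function hidePasswords(req, response) {
  const data = response.data
  // Pass through anything that is not an object or array (for example undefined).
  if (data === null || typeof data !== 'object') return { data }
  const strip = ({ password, ...rest }) => rest
  return { data: Array.isArray(data) ? data.map(strip) : strip(data) }
}

const cleaned = hidePasswords({}, { data: [{ id: 'peter', password: 'hunter2' }] })
console.log(cleaned)
```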

Commands

startupDB supports several commands that can be executed by sending a POST or GET request to the root.

For example:

curl --header "Content-Type: application/json" \
  --request POST \
  --data "{\"command\":\"purgeOplog\",\"collection\":\"00000/sku\"}" \
  http://127.0.0.1:3000/data

| Command | Method | Function | Parameters |
|---|---|---|---|
| | GET | List all collections. | |
| create | POST | Create collection, errors when it already exists. | collection:"string", storageType:"array" or "object" |
| drop | POST | Removes a collection from memory, oplog and checkpoint directories. | collection:"string" |
| ensureCollection | POST | Create collection if it does not exist, no error if it does. | collection:"string", storageType:"array" or "object" |
| flush | POST | Create checkpoint and flush oplog. | collection:"string", options:{archive:true/false} (mandatory when opLogArchive is used) |
| inspect | POST | Return totalNrObjects in memory. | |
| purgeOplog | POST | Remove all operations from opLog, restoring collection to previous checkpoint. This is useful for implementing tests. Collection parameter can be "*" to purge all collections or a comma separated string of collections. | collection:"string" |
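
The same command can be sent from JavaScript. The sketch below only builds the arguments for fetch(), so it runs without a server. buildCommandRequest is a hypothetical helper; the URL and payload mirror the curl example above.

```javascript
// Hypothetical helper: builds the arguments for fetch() to send a startupDB command.
function buildCommandRequest(rootUrl, command, parameters = {}) {
  return [
    rootUrl,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ command, ...parameters })
    }
  ]
}

const [commandUrl, init] = buildCommandRequest('http://127.0.0.1:3000/data', 'purgeOplog', { collection: '00000/sku' })
console.log(commandUrl, init.body)
// With a running server: const response = await fetch(commandUrl, init)
```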

Contributors

bitsplitter

startupdb's Issues

Server crashes

When an operation is performed on a collection that is among the least recently used collections, the GC may decide to delete it, causing the application to fail because it expects the data to be available.

We need to 'touch' the collection before the actual transaction is written to keep the GC from interfering.

StartupDB throws an error when the updateObjects payload is undefined

When I call updateObjects without a payload, I get the following error:

STARTUPDB Error TypeError: Cannot read properties of undefined (reading 'id')
at addIdsToItemsThatHaveNone (/Users/jeroen/LatestCollectionServer/node_modules/startupdb/dist/server.js:117:19)
at dbUpdateObjects (/Users/jeroen/LatestCollectionServer/node_modules/startupdb/dist/server.js:569:5)
at async /Users/jeroen/LatestCollectionServer/dist/hooks/beforeGetCampaign.js:24:5
at async processMethod (/Users/jeroen/LatestCollectionServer/node_modules/startupdb/dist/server.js:692:24)
at async /Users/jeroen/LatestCollectionServer/node_modules/startupdb/dist/server.js:772:20

As a developer, I would like a more relevant error that helps me pinpoint the issue quicker.

StartupDB hangs on a malformed URI

When a malformed URL is passed, an exception is thrown that is not caught anywhere, crashing the request and leaving the requester without a response.
It would be better to try decoding the URI and return a 400 error when that fails.
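
The suggested behaviour could be sketched as an Express-style middleware like the one below. This is not the fix that was applied in startupDB; rejectMalformedUri is a hypothetical name, and the demo uses a stand-in response object so that it runs without a server.

```javascript
// Sketch of an Express-style middleware that answers 400 instead of throwing.
function rejectMalformedUri(req, res, next) {
  try {
    decodeURIComponent(req.url) // throws URIError on sequences such as %E0%A4%A
  } catch (err) {
    if (err instanceof URIError) {
      res.statusCode = 400
      res.end('Malformed URI')
      return
    }
    throw err
  }
  next()
}

// Demo with a stand-in response object instead of a real server.
const demoRes = { statusCode: 200, body: null, end(text) { this.body = text } }
rejectMalformedUri({ url: '/myDB/user?id=%E0%A4%A' }, demoRes, () => {})
console.log(demoRes.statusCode, demoRes.body)
```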

Remove chalk library

Chalk library output on the console is hard to read when analysing raw log files from a CI/CD pipeline.
I would prefer plain log files.

x-last-oplog-id header returned by HEAD can be invalid

When using the HEAD method to get the latest update information on a collection, the x-last-oplog-id header will return a wrong id when there are oplog files with different filename lengths.
Oplog files are named .json. When retrieving all files from the oplog directory, at least on Windows, files are returned sorted alphabetically. That does not give us the most recent file when not all filenames are the same length.
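
The effect can be reproduced with a plain array sort. The sketch assumes file names that consist of a numeric id followed by .json; this naming is an assumption made for illustration and is not confirmed above.

```javascript
// Assumes oplog file names are a numeric id plus ".json"; this naming is an assumption.
const files = ['10.json', '9.json', '100.json']

// The default sort compares strings, so "9.json" ends up last.
const alphabetical = [...files].sort()

// Comparing the parsed ids puts the highest id last.
const numeric = [...files].sort((a, b) => parseInt(a, 10) - parseInt(b, 10))

console.log(alphabetical[alphabetical.length - 1])
console.log(numeric[numeric.length - 1])
```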

Server can crash during GC

Due to the inaccuracy in memory footprint estimation, GC can draw the wrong conclusion about available memory and evict all collections, even the ones that are in the process of being read from disk.

Collections are garbage collected too often

usedBytesInMemory is updated with a wrong number when a collection loads into memory.
As a result, the garbage collection decides to evict collections for no good reason.

Error during near-simultaneous access

When accessing a resource that is not in memory yet, with multiple requests simultaneously, different errors are thrown, all boiling down to the resource not (completely) loaded in the in-memory cache.

Add support to delete by id or filter

As a developer, I would like to use the DELETE method in a similar way as the GET method.
So it should support the id and filter parameters just as the GET method does.

It should DELETE a single object matching the id or it should DELETE all objects matching the expression in the filter.

Allow Object.assign on PATCH

Currently, the PATCH method only supports jsonpatch. I'd like to be able to merge my update with an existing document, only passing the new or updated properties instead of having to replace the entire document with a PUT.

If the PATCH body does not contain a patch array, apply Object.assign to it to merge the source and target objects.

Negative RAM measurement

While testing the RAM usage of startupDB, I somehow got a negative RAM usage, which should not be possible. Please fix this bug.

usedBytesInMemory: -157573965

Make memory management configurable

In order to make startupDB more generic, allow to configure its memory management.

DB_CACHE_OVERHEAD_RATIO Determines the predicted memory footprint of JSON objects. The default value of 2 seems to work well in existing deployments but in more specific use-cases, another ratio might work better, for example if there are very deeply nested objects that have higher memory overhead.

DB_CACHE_FRACTION Determines the fraction of memory used for the DB cache. If the NodeJS-Express API that embeds StartupDB needs a larger fraction of available memory allocated to NodeJS via the --max-old-space-size parameter, this fraction needs to be configurable.
