ryanfitz / vogels


DynamoDB data mapper for node.js

License: Other

Makefile 0.42% JavaScript 99.58%


vogels's Issues

Item.save does not respect timestamps

I noticed that Item.save is not setting the updatedAt column for items with a schema of timestamps = true.

[improvement] Object schema type

Although DynamoDB does not directly support attributes of type Object, it would be nice if vogels supported schema types for Object & ObjectSet where vogels would serialize/deserialize the object's JSON representation to/from a DynamoDB string attribute type.
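A minimal sketch of the round trip this issue asks for, assuming the object is stored as a JSON string in a DynamoDB string attribute. The helper names serializeObject and deserializeObject are hypothetical and are not part of the vogels API.

```javascript
// Hypothetical helpers; these names are not part of the vogels API.

// Convert a plain object into a string suitable for a DynamoDB "S" attribute.
function serializeObject(value) {
  return JSON.stringify(value);
}

// Parse the stored string back into an object; return null for bad input.
function deserializeObject(stored) {
  if (typeof stored !== 'string') {
    return null;
  }
  try {
    return JSON.parse(stored);
  } catch (err) {
    return null;
  }
}

var stored = serializeObject({ nickname: 'foo', tags: ['a', 'b'] });
var restored = deserializeObject(stored);
console.log(restored.tags[1]);
```

Returning null on malformed input is one possible choice; a real schema type might prefer to surface a validation error instead.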

Make logging optional

I'm using this module and it generates noise in our logs when it creates a table. I'd like to have the ability to disable this logging. Would you be open to a pull request that uses the debug module for enabling logging based on environment variable settings?
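The idea of switching logging on through an environment variable could look roughly like the sketch below. It is a dependency-free stand-in for the debug module, not that module's implementation; createLogger and its parameters are hypothetical names.

```javascript
// Hypothetical stand-in for the `debug` module: logging is off unless the
// namespace appears in a comma-separated DEBUG-style setting.
function createLogger(namespace, setting, sink) {
  var enabled = String(setting || '').split(',').some(function (name) {
    name = name.trim();
    return name === '*' || name === namespace;
  });
  var write = sink || console.log;

  // Returns true when the message was written, false when it was suppressed.
  return function log() {
    if (!enabled) {
      return false;
    }
    var args = Array.prototype.slice.call(arguments);
    write.apply(null, ['[' + namespace + ']'].concat(args));
    return true;
  };
}

// Silent by default; run with DEBUG=vogels to see the message.
var log = createLogger('vogels', process.env.DEBUG);
log('waiting for table to become ACTIVE');
```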

2.0.0 Release Notes

Summary

vogels v2.0 introduces a new api for defining models. The new define api brings full JSON document support to vogels, including support for deeply nested documents. DynamoDB Expression apis are exposed as options to all model methods. In order to support JSON documents the update api call has been switched to using UpdateExpressions from the deprecated AttributeUpdates api. The riskiest part of migrating to this version will be testing out your update operations.

Upgrade time: low - a few hours to a day for most users
Complexity: low - small list of well understood, easy to locate, and trivial to modify
Risk: moderate - update item calls switched to using DynamoDB UpdateExpressions api
Dependencies: moderate - requires using latest 2.1.x AWS-SDK and Joi 5.x.x.

Breaking Changes

New Features

Full JSON document support, including support for List, Map, and BOOL datatypes. Closes #16, #44
New timestamps config option to automatically add the attributes createdAt and updatedAt when defining a model. These attributes are set automatically when you create and update a record.
Flexible schema configuration to allow unknown, dynamic attributes both on top level and nested records
Ability to fully configure global and local secondary indexes, including index names, attribute projection settings and index throughput. Closes #43 , #48
Added deleteTable api to remove the table from DynamoDB. Closes #10
New integration test suite and integration with travis-ci.
100% code coverage

Bug Fixes

CreateTables checks if error occurs while attempting to create a table. Fixes #41
Fixed error handling when streaming query, scan and parallel scan requests.
Fixed retry handling when running query and scans. Fixes #45

Updated dependencies

Joi to v5.1.0
aws-sdk to v2.1.8 (Closes #49 )
async to v0.9.0
lodash to v3.1.0

Migration Checklist

Define a Model

The api for defining a model has been changed to accept a block of json configuration.

var Joi = require('joi');

var Account = vogels.define('Account', {
  hashKey : 'email',

  // add the timestamp attributes (updatedAt, createdAt)
  timestamps : true,

  schema : {
    email   : Joi.string().email(),
    name    : Joi.string(),
    age     : Joi.number(),
    roles   : vogels.types.stringSet(),
    settings : {
      nickname      : Joi.string(),
      acceptedTerms : Joi.boolean().default(false)
    }
  }
});

Secondary Indexes

Both local and global secondary indexes are defined when defining the model.

var BlogPost = vogels.define('BlogPost', {
  hashKey : 'email',
  rangeKey : 'title',
  schema : {
    email             : Joi.string().email(),
    title             : Joi.string(),
    content           : Joi.binary(),
    PublishedDateTime : Joi.date()
  },

  indexes : [
    // local index: same hash key as the table, different range key
    { hashKey : 'email', rangeKey : 'PublishedDateTime', type : 'local', name : 'PublishedIndex' },
    // global index: its own hash and range keys
    { hashKey : 'title', rangeKey : 'email', type : 'global', name : 'TitleEmailIndex' }
  ]
});

support Global Secondary Indexes

Status for the IN operator?

What is the status regarding the IN operator?

The IN operator is documented on the main page in the scan section, but apparently the corresponding test is still disabled.

I have tried to use the IN operator based on the example code, but it raises a ValidationException.

Executing this:

'use strict';

var AWS = require('aws-sdk');
var vogels = require('vogels');
var Joi = require('joi');

AWS.config.update({
    accessKeyId: 'accessKeyId',
    secretAccessKey: 'secretAccessKey',
    region: 'region'
});

vogels.dynamoDriver(new AWS.DynamoDB({ endpoint: 'http://localhost:8000' }));

var Account = vogels.define('Account', {
  hashKey : 'email',

  // add the timestamp attributes (updatedAt, createdAt)
  timestamps : true,

  schema : {
    email   : Joi.string().email(),
    name    : Joi.string(),
    age     : Joi.number(),
    roles   : vogels.types.stringSet(),
    settings : {
      nickname      : Joi.string(),
      acceptedTerms : Joi.boolean().default(false)
    }
  }
});

Account.config({tableName: 'AccountsTable'});

vogels.createTables({
  'example-Account'  : {readCapacity: 1, writeCapacity: 10},
}, function (err) {
  if(err) {
    console.log('Error creating tables', err);
    process.exit(1);
  }

  Account.create({email: '[email protected]', name: 'Foo Bar', age: 21}, function (err, acc) {
      console.log('created account in DynamoDB', acc.get('email'));
      Account
          .scan()
          .where('name').in(['[email protected]', '[email protected]'])
          .exec(function (err, accounts) {
            console.log('Error:', err);
            console.log('Accounts:', JSON.stringify(accounts));
          });
    });
});

Gives this output:

created account in DynamoDB [email protected]
Error: { [ValidationException: One or more parameter values were invalid: ComparisonOperator IN is not valid for L AttributeValue type]
  message: 'One or more parameter values were invalid: ComparisonOperator IN is not valid for L AttributeValue type',
  code: 'ValidationException',
  time: Wed Mar 18 2015 16:24:17 GMT+0100 (W. Europe Standard Time),
  statusCode: 400,
  retryable: false,
  retryDelay: 0 }
Accounts: undefined

Should the IN operator be removed from the documentation if it is not possible to use it?

describe table

Issue with package.json dynamodb-doc dependency

When I npm install vogels (2.0.0-rc9), the package.json under node_modules/vogels has this dependency:

"dynamodb-doc": "git+https://github.com/awslabs/dynamodb-document-js-sdk"

It requires git and causes problems when deploying to Elastic Beanstalk on AWS.

create table

update joi version

The version currently used is extremely out of date. However, updating it is potentially a major breaking change.

Attribute Projection for Local Secondary Indexes

It would be great if the schema also allowed for definition of local secondary index attribute projection, in a similar way to the global index projections.

Configuration for local DynamoDB

How might I configure this package to use a locally running DynamoDB (based on http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.DynamoDBLocal.html)?

Need to validate data when doing Table.update

The Table.create method uses joi to check that all of the supplied data is valid before attempting to serialize it and write it to the database (table.js line 116). However, Table.update does not do any data validation. This can result in invalid data being saved to the db. It can also cause a crash since some of the serializer code assumes that it is working with valid data. (For example, if the user performs an update with an invalid Date, a later call to Date.toISOString will raise an uncaught error.)

data export

Currently, backing up data in DynamoDB is very challenging. I want an export feature to run a parallel scan and stream the json data to a writeable stream.

Better retry logic

It looks like vogels implements retry logic if a request fails, but it seems to just hammer the DynamoDB api until it either fails or succeeds. According to Amazon, retryable errors should be handled with something like an exponential backoff.

References:
Vogel Retry Logic
Amazon Exponential Backoff

Thoughts?
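A rough sketch of what exponential backoff could look like, assuming errors carry a retryable flag as in the aws-sdk error objects shown elsewhere on this page. The function names, the options object and the injectable schedule function are all hypothetical; this is not how vogels currently retries.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Retry `operation(callback)` while it fails with err.retryable === true.
// `options.schedule` is injectable for tests and defaults to setTimeout.
function retryWithBackoff(operation, options, callback) {
  var schedule = options.schedule || setTimeout;
  var attempt = 0;

  function run() {
    operation(function (err, result) {
      if (err && err.retryable && attempt < options.maxRetries) {
        var delay = backoffDelay(attempt, options.baseMs, options.maxMs);
        attempt += 1;
        return schedule(run, delay);
      }
      // Success, a non-retryable error, or retries exhausted.
      callback(err, result);
    });
  }

  run();
}
```

With baseMs of 50 the waits would be 50, 100, 200 ms and so on up to maxMs. Adding random jitter to each delay is a common refinement that this sketch leaves out.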

Need an "add" method - which only creates a new item if another one doesn't already exist with the same key

In DynamoDB's putItem you pass:

{
  "Expected" : {
    "someKey" : {
      "Exists": false
    }
  }
}
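As a sketch of how an "add" method might assemble its request, the helper below builds putItem parameters using the Expected form quoted above. buildAddParams is a hypothetical name, and the table name, key name and item are illustrative only.

```javascript
// Hypothetical helper: build putItem parameters that only succeed when no
// item with the same hash key exists yet. Uses the `Expected` form quoted
// in the issue above.
function buildAddParams(tableName, hashKey, item) {
  var expected = {};
  expected[hashKey] = { Exists: false };

  return {
    TableName: tableName,
    Item: item,
    Expected: expected
  };
}

var params = buildAddParams('accounts', 'email', {
  email: { S: 'foo@example.com' },
  name: { S: 'Foo Bar' }
});
console.log(JSON.stringify(params.Expected));
```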

Support BatchGetItem on multiple tables

BatchGetItem right now only supports doing multiple GetItem calls on a single table. It would be nice if it also supported doing GetItem calls on multiple tables as well.

Not sure how the design of this should be, though, since all operations are functions inside a specific table object.
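One possible shape for this, sketched under the assumption that the caller supplies a list of table names with their keys. DynamoDB's BatchGetItem takes a RequestItems map keyed by table name; buildBatchGetParams and the input format are hypothetical.

```javascript
// Hypothetical helper: group keys by table into one BatchGetItem request.
// `requests` is a list of { tableName, keys } entries.
function buildBatchGetParams(requests) {
  var requestItems = {};

  requests.forEach(function (request) {
    if (!requestItems[request.tableName]) {
      requestItems[request.tableName] = { Keys: [] };
    }
    var target = requestItems[request.tableName].Keys;
    request.keys.forEach(function (key) {
      target.push(key);
    });
  });

  return { RequestItems: requestItems };
}

var batchParams = buildBatchGetParams([
  { tableName: 'accounts', keys: [{ email: { S: 'foo@example.com' } }] },
  { tableName: 'posts', keys: [{ id: { S: 'p1' } }, { id: { S: 'p2' } }] }
]);
console.log(Object.keys(batchParams.RequestItems));
```

A module-level function such as this would sit outside any single table object, which is one way around the design question raised above.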

Sanitization like Mongoosejs

What are your thoughts on adding something similar to what mongoose has that only saves input that is part of the schema? I need to sanitize input before saving the model using vogels. I thought about doing something like this with lodash, which would work, but it would be nice if this were built into vogels.

var _ = require('lodash');

// in a user create express route...
var userData = _.pick(req.body, 'email', 'name', 'username', 'title');
User.create(userData, function (err, user) {
  console.log('created account in DynamoDB', user.get('email'));
});

This would only save the email, name, username, and title to the user. Even if something like admin: true was in the body.

I know mongoosejs has a strict implementation that might be interesting in vogels.

Thoughts?

createTables should check for errors

createTables.js -> createTables starts with a 'describeTable' request, but never checks if there was an error from that request. This becomes a problem especially when you haven't got proper AWS credentials in which case the application goes zombie "waiting for the table to become ACTIVE" which of course can never happen.

Solution: add if(err) throw err; or similar to
https://github.com/ryanfitz/vogels/blob/master/lib/createTables.js#L14
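A sketch of the kind of check being asked for, assuming a driver object that exposes describeTable(params, callback) in the aws-sdk style. checkTable is a hypothetical name, and passing the error to the callback is used here in place of the throw suggested above.

```javascript
// Sketch with a hypothetical `driver` exposing describeTable(params, cb).
// A ResourceNotFoundException means the table has to be created; any other
// error (for example bad credentials) is passed straight to the callback.
function checkTable(driver, tableName, callback) {
  driver.describeTable({ TableName: tableName }, function (err, data) {
    if (err && err.code === 'ResourceNotFoundException') {
      return callback(null, { exists: false });
    }
    if (err) {
      return callback(err);
    }
    callback(null, { exists: true, status: data.Table.TableStatus });
  });
}
```

With a check like this, a credentials failure surfaces immediately instead of leaving the caller waiting for a table that can never become ACTIVE.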

Get default region from environment?

aws-sdk uses $AWS_ACCESS_KEY and $AWS_SECRET_KEY from the environment if they exist; since vogels uses aws-sdk internally, setting these values also works for it.

I know reading defaults from the environment can be controversial, but would you be open to a PR where vogels also read the region from an environment var called $VOGELS_AWS_REGION or something?

Timestamps as integers

Is it possible to save createdAt & updatedAt as integers?

Add promise support

I am considering migrating to DynamoDB for one of my projects, and vogels looks like an excellent option. However, I would really like to see support for using promises for handling callbacks. Node promise library: https://github.com/kriskowal/q
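Until something like this is built in, a small adapter can wrap node-style callbacks. The sketch below uses the native Promise constructor rather than the q library linked above; promisify and fakeModel are hypothetical names, and fakeModel stands in for a vogels model only for illustration.

```javascript
// Hypothetical adapter: wrap a node-style method so it returns a Promise.
function promisify(target, methodName) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    return new Promise(function (resolve, reject) {
      // Append the node-style callback as the last argument.
      args.push(function (err, result) {
        if (err) {
          return reject(err);
        }
        resolve(result);
      });
      target[methodName].apply(target, args);
    });
  };
}

// In-memory stand-in for a model, used only to show the call shape.
var fakeModel = {
  get: function (key, callback) {
    callback(null, { email: key });
  }
};

promisify(fakeModel, 'get')('foo@example.com').then(function (item) {
  console.log(item.email);
});
```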

Ability to bind functions to returned objects

It would be very useful to have the ability to bind functions to each object returned from a Model get/scan/query, similar to MongooseJS's (MongoDB ORM) instance methods: http://mongoosejs.com/docs/guide.html#methods

Vogels could allow methods to be added to a Model definition and then bind them to its returned objects using UnderscoreJS's bind method (http://underscorejs.org/#bind).

New JSON Schema type

Have you considered a JSON or Object schema type that is simply a serialized JSON string but which would be automatically stringified or parsed on the way in/out of Dynamo? Would you be open to this functionality as a pull request? Storing a JSON blob in Dynamo is a common pattern I find myself using.

why lodash over lazy.js

@ryanfitz - curious, why did you pick lodash over lazy.js for internals? We really like vogels and would like to start using it in some of our products (and contributing back). However, we are still in our evaluation phase on what data mapper to use. In practice we have seen lazy.js be faster in most comparisons. Just curious about your thought process. Thanks in advance.

batch write item

upgrade to joi 3

Should we automatically remove null properties when serializing?

Trying to save a null throws a ValidationException from dynamo.
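If the answer is yes, the stripping step might look like the sketch below. omitNulls is a hypothetical helper, not existing vogels behaviour, and it treats undefined the same as null.

```javascript
// Hypothetical helper: return a copy of `value` without null or undefined
// properties, recursing into nested plain objects and arrays.
function omitNulls(value) {
  if (Array.isArray(value)) {
    return value
      .filter(function (entry) { return entry !== null && entry !== undefined; })
      .map(omitNulls);
  }
  // Dates and Buffers are objects too, but must be passed through untouched.
  if (value && typeof value === 'object' &&
      !(value instanceof Date) && !Buffer.isBuffer(value)) {
    var result = {};
    Object.keys(value).forEach(function (key) {
      if (value[key] !== null && value[key] !== undefined) {
        result[key] = omitNulls(value[key]);
      }
    });
    return result;
  }
  return value;
}

var cleaned = omitNulls({ name: 'Foo Bar', age: null, settings: { nickname: null } });
console.log(JSON.stringify(cleaned));
```

Note that a nested object whose properties are all null becomes an empty object rather than being removed; whether that is desirable is a design decision.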

update table

Add support for add/drop indexes from live table

AWS announced a new feature today that allows you to create or drop global secondary indexes on a table without having to delete and recreate the table. Vogels should support this feature in updateTable()

streaming api for batch get items

Passing in region from vogels.AWS.config.update isn't working

I'm not doing anything other than loading in my credentials from a json file:

vogels.AWS.config.loadFromPath('./credentials.json');

This is my credentials file:

{"accessKeyId": "key", "secretAccessKey": "secret", "region": "us-west-1"}

When I try to connect to the db, I'm getting this regardless of what region I put in:

[ConfigError: Missing region in config]

support for dynamodb local ?

DynamoDB Local is useful for offline development and unit tests. Does vogels support it?

Geolocation

Is there a standard way to use geolocation in vogels? If not, what would it take to build such a feature?

Regards,
Debdatta Basu

Allow reconfiguration of DynamoDB endpoint (support for DynamoDB Local)

It would be great to have a way to define the DynamoDB endpoint as I'd like to use DynamoDB local when running tests. An example configuration can be found here:

http://stackoverflow.com/questions/19868938/how-i-can-work-with-amazons-dynamodb-local-in-node

Add ability to limit the count of matched items.

In a scan, limit restricts the number of items scanned. It would be nice if there were a way to limit the amount of matched items a scan returns. Vogels should also probably recurse through each LastEvaluatedKey until the limit is met.

Query also appears to have a similar issue, but only when the query response is getting paginated (such as when it hits the 1MB limit per response). If I do a limit(500) and only get, for example, 100 items returned with a LastEvaluatedKey, Vogels should recurse through and continue to get more responses until the limit is met, or no more results are available.

Right now the only workaround for these is doing a loadAll, which does not respect limit and will always get all available items (which can potentially waste throughput).
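The recursion described above can be sketched as follows. fetchPage is a hypothetical function that performs one scan or query request starting at the given key and yields a response with Items and, when more data remains, LastEvaluatedKey.

```javascript
// Sketch: `fetchPage(startKey, cb)` is a hypothetical function that runs one
// scan or query request and yields { Items, LastEvaluatedKey }.
function collectUpTo(limit, fetchPage, callback) {
  var collected = [];

  function next(startKey) {
    fetchPage(startKey, function (err, page) {
      if (err) {
        return callback(err);
      }
      // Keep at most `limit` matched items.
      collected = collected.concat(page.Items).slice(0, limit);
      if (collected.length >= limit || !page.LastEvaluatedKey) {
        return callback(null, collected);
      }
      next(page.LastEvaluatedKey);
    });
  }

  next(null);
}
```

Unlike loadAll, this stops issuing requests as soon as enough matched items have been gathered, so it avoids reading pages that would be thrown away.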

last evaluated keys isn't fully returned when querying global secondary indexes

Support for transaction

I think support for DynamoDB transactions would be a valuable addition to this library, since most applications have this requirement.

Allow registration of user-defined schema types

Vogels already has support for internal schema types such as Boolean or UUID. It'd be nice if there was an easy way to define and register additional schema types for project specific use.

Query eating dynamodb errors

It looks like at https://github.com/ryanfitz/vogels/blob/master/lib/utils.js#L82 the util class calls the callback (in this case resultsFunc) without reporting that an error happened. If the error is recoverable, like ProvisionedThroughputExceededException the error should either be reported back to the query.execute method, or the library should try to handle the error through a retry. Right now no error is reported for retryable errors, and the query is not automatically retried.

Happy to submit a pr if you have a direction you would like to go with solving this.

Schema tableName config issue.

var vogels = require('vogels');

vogels.AWS.config.loadFromPath('config/credentials.json');
var Account = vogels.define('Account', function (schema) {
  schema.String('email', {hashKey: true});
  schema.String('name').required(); // name attribute is required
  schema.Number('age'); // age is optional
  schema.Date('created', {default: Date.now});
});
Account.config({tableName: 'Account'});
Account.create({email: '[email protected]', name: 'Foo Bar', age: 21}, function (err, acc) {
  console.trace(err);
});

fails with

Trace: { [ResourceNotFoundException: Requested resource not found]
  message: 'Requested resource not found',
  code: 'ResourceNotFoundException',
  time: Tue Aug 12 2014 23:18:58 GMT+0530 (IST),
  statusCode: 400,
  retryable: false }

Issue goes away on removing

Account.config({tableName: 'Account'});

Empty value on String field is not allowed

When defining a schema like:

var Item = vogels.define('Item', function (schema) {
    schema.UUID('id', {hashKey: true});
    schema.Number('asset_id').required();
    schema.Number('user_id').required();
    schema.String('asset');
});

Calling

Item.create({id: 1, asset_id: 2, user_id: 3, asset: ''})

fails with

details=[message=the value of asset is not allowed to be empty, path=asset, type=any.empty]

Unprocessed items in batch get item causes AWS error

If Dynamo cannot get all 100 items in a batch get request, vogels replays the UnprocessedKeys. However this object needs to be wrapped back in an object under the key RequestItems. As it stands now a parameter validation error will result in the aws-sdk because it doesn't see a RequestItems key on the request. PR forthcoming.
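The wrapping described above amounts to something like this sketch. buildRetryParams is a hypothetical helper name; the UnprocessedKeys and RequestItems fields are the ones named in the issue.

```javascript
// Hypothetical helper: turn a BatchGetItem response into the parameters for
// the follow-up request, or null when nothing is left to fetch.
function buildRetryParams(response) {
  var unprocessed = response.UnprocessedKeys || {};
  if (Object.keys(unprocessed).length === 0) {
    return null;
  }
  // The replayed request needs the keys under RequestItems again.
  return { RequestItems: unprocessed };
}

var retryParams = buildRetryParams({
  Responses: {},
  UnprocessedKeys: { accounts: { Keys: [{ email: { S: 'foo@example.com' } }] } }
});
console.log(Object.keys(retryParams));
```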

Add a related model subquery API

As regards subqueries, I have already set up instance methods on my models that will go out and grab related records. I am injecting them automatically by parsing my model pseudo-code as such:

model User:
{
  hashKey: 'id',
  schema: {
    id: vogels.types.uuid(),
    orders: vogels.types.stringSet()
  }
}

model Order:
{
  hashKey: 'id',
  schema: {
    id: vogels.types.uuid(),
    user: Joi.string()
  }
}

In this case I'd store the id attribute from each related order in the Users model instance 'orders' set, then use Order.getItems() to query the orders back out by their keys rather than scanning the whole Order table looking for a user id that matched or building global secondary indexes on that attribute.

Right now when I create my models I inject getters for the related data into Model.prototype. This works fine for lazy loading, but sometimes I want to make a single query and get back related model instances already included in a model instance substructure that lets me interact with them against the other model's API.

So basically, I have lazy loading solved, but I need a solution similar to how you roll up multiple queries into a single response to get around dynamo's max records in a resultset restriction, except synthesizing results from a subquery to another table rather than aggregating results for multiple queries on the same table.

destroy table

Client side/browser support

This looks great! I plan on using DynamoDB a lot on the client side/browser. Are there any plans to support this? Thanks.

update item does not work with expected option

When the expected option is passed to update() it throws Error: Need to pass in a valid Condition Object. From what I can tell, the Expected.<key>.Value attribute is missing the data type required by the updateItem api docs.

If you turn on AWS SDK logging you can see the data going over the wire.

...
AWS.config.update({logger: process.stdout});

var mWsAccessToken = vogels.define('WsAccessToken', {
    tableName:  cfg.dataModelPrefix+'-ws-access-tokens',
    timestamps: true,
    hashKey:    'token',
    rangeKey:   'expires',
    schema:     {
        token:        Joi.string(),
        expires:      Joi.date(),
        tokenVersion: Joi.number(),
        userId:       Joi.string(),
        ip:           Joi.string()
    }
});
...

return pWrapper.ninvoke(mWsAccessToken, "create", {
    token:        token,
    expires:      expires,
    tokenVersion: currentTokenVersion,
    userId:       userId,
    ip:           optionalIp
})
    .then(function (oWsAccessToken) {
        console.error("DATA", oWsAccessToken);
        console.error("token", oWsAccessToken.get("token"), "expires", oWsAccessToken.get("expires"));

        return pWrapper.ninvoke(mWsAccessToken, "update",
            {token: oWsAccessToken.get("token"), expires: oWsAccessToken.get("expires"), ip: "5.5.5.5"},
            {expected: {userId: "1235"}});
    })
    .then(function (oWsAccessToken) {
        console.error("DATA After update", oWsAccessToken);
    })
    .catch(function (e) {
        console.error("ERROR", e, e.stack);
    });

The update above produces: Error: Need to pass in a valid Condition Object. With the following data over the wire:

updateItem({ TableName: 'ryan-ws-access-tokens',
 Key: 
  { token: 'MSAxNDI4MDg4NzcxODc5IDEyMzQgMS4yLjMuNA==',
    expires: '2015-04-03T19:19:31.879Z' },
 ReturnValues: 'ALL_NEW',
 UpdateExpression: 'SET #ip = :ip, #updatedAt = :updatedAt',
 ExpressionAttributeValues: { ':ip': '5.5.5.5', ':updatedAt': '2015-03-04T19:19:32.536Z' },
 ExpressionAttributeNames: { '#ip': 'ip', '#updatedAt': 'updatedAt' },
 Expected: { userId: { Value: '1235' } } })

For comparison, if the {expected: {userId: "1235"}} option is removed from the update call, it produces the following over the wire (which succeeds):

updateItem({ TableName: 'ryan-ws-access-tokens',
 Key: 
  { token: { S: 'MSAxNDI4MDg4NTQ4NjE4IDEyMzQgMS4yLjMuNA==' },
    expires: { S: '2015-04-03T19:15:48.618Z' } },
 ReturnValues: 'ALL_NEW',
 UpdateExpression: 'SET #ip = :ip, #updatedAt = :updatedAt',
 ExpressionAttributeValues: 
  { ':ip': { S: '5.5.5.5' },
    ':updatedAt': { S: '2015-03-04T19:15:49.294Z' } },
 ExpressionAttributeNames: { '#ip': 'ip', '#updatedAt': 'updatedAt' } })
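One possible direction, sketched here as an assumption rather than the library's fix: DynamoDB does not accept the legacy Expected parameter in the same request as an UpdateExpression, so the expected values could be translated into a ConditionExpression. buildConditionExpression and the #c / :c placeholder names are hypothetical, and values are left untyped on the assumption that a later serialization step adds the data types.

```javascript
// Hypothetical helper: translate { attribute: value } pairs into the
// expression-style condition parameters that can accompany UpdateExpression.
function buildConditionExpression(expected) {
  var names = {};
  var values = {};

  var clauses = Object.keys(expected).map(function (key, index) {
    // Index-based placeholders avoid clashes with unusual attribute names.
    var name = '#c' + index;
    var value = ':c' + index;
    names[name] = key;
    values[value] = expected[key];
    return name + ' = ' + value;
  });

  return {
    ConditionExpression: clauses.join(' AND '),
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values
  };
}

var condition = buildConditionExpression({ userId: '1235' });
console.log(condition.ConditionExpression);
```

The resulting names and values would have to be merged with the ones already generated for the UpdateExpression before the request is sent.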

documentation

Need to create docs on how to use vogels.

getItems destroys input array

which makes it difficult to keep the results in the same order
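A sketch of two helpers that would address this, assuming the input array is consumed while the keys are split into batches (the issue does not say where the mutation happens). chunkKeys, orderByKeys and the plain-object item shape are hypothetical.

```javascript
// Hypothetical helpers. `chunkKeys` splits keys into batches without
// touching the caller's array (Array.prototype.splice would mutate it).
function chunkKeys(keys, size) {
  var chunks = [];
  for (var i = 0; i < keys.length; i += size) {
    chunks.push(keys.slice(i, i + size));
  }
  return chunks;
}

// Reorder fetched items to match the order of the requested keys,
// skipping keys for which no item came back.
function orderByKeys(keys, items, hashKey) {
  var byKey = {};
  items.forEach(function (item) {
    byKey[item[hashKey]] = item;
  });
  return keys
    .map(function (key) { return byKey[key]; })
    .filter(function (item) { return item !== undefined; });
}

var wanted = ['a', 'b', 'c'];
var batches = chunkKeys(wanted, 2);
var ordered = orderByKeys(wanted, [{ id: 'c' }, { id: 'a' }], 'id');
console.log(wanted.length, batches.length, ordered.length);
```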

Optional Schema Attributes

Are optional non-key schema attributes supported? It seems that they're not, and I was wondering if this is by design.

The issue I'm running into is that if I set {optional: true} for an attribute in the schema, it passes Joi validation but still gets serialized, which means the AWS validation fails because the key exists but has a value of null or an empty string (which is not allowed).