ryanfitz / vogels
DynamoDB data mapper for node.js
License: Other
I noticed that Item.save is not setting the updatedAt column for items with a schema of timestamps = true.
Although DynamoDB does not directly support attributes of type Object, it would be nice if vogels supported schema types for Object & ObjectSet where vogels would serialize/deserialize the object's JSON representation to/from a DynamoDB string attribute type.
I'm using this module and it generates noise in our logs when it creates a table. I'd like to have the ability to disable this logging. Would you be open to a pull request that uses the debug module for enabling logging based on environment variable settings?
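A rough sketch of what environment-driven logging could look like. It imitates the DEBUG convention of the debug module using only the Node standard library; `createLogger` is a hypothetical helper, not part of vogels or debug, and it matches exact namespace names only (no wildcards).

```javascript
// Hypothetical helper: returns a logger that stays silent unless `namespace`
// appears in the comma-separated DEBUG value of the given environment object.
function createLogger(namespace, env) {
  var enabled = String(env.DEBUG || '').split(',').indexOf(namespace) !== -1;
  return function log() {
    if (!enabled) { return false; }
    var args = Array.prototype.slice.call(arguments);
    // Prefix the format string so printf-style placeholders keep working.
    args[0] = namespace + ' ' + args[0];
    console.error.apply(console, args);
    return true;
  };
}

var log = createLogger('vogels', process.env);
log('creating table %s', 'Account'); // written to stderr only when DEBUG includes "vogels"
```

Running with `DEBUG=vogels node app.js` would turn the messages on; without the variable the table-creation noise disappears.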
vogels v2.0 introduces a new api for defining models. The new define api brings full JSON document support to vogels, including support for deeply nested documents. DynamoDB Expression apis are exposed as options to all model methods. In order to support JSON documents the update api call has been switched to using UpdateExpressions from the deprecated AttributeUpdates api. The riskiest part of migrating to this version will be testing out your update operations.
The api for defining a model has been changed to accept a block of json configuration.
var vogels = require('vogels');
var Joi = require('joi');

var Account = vogels.define('Account', {
  hashKey : 'email',

  // add the timestamp attributes (updatedAt, createdAt)
  timestamps : true,

  schema : {
    email : Joi.string().email(),
    name : Joi.string(),
    age : Joi.number(),
    roles : vogels.types.stringSet(),
    settings : {
      nickname : Joi.string(),
      acceptedTerms : Joi.boolean().default(false)
    }
  }
});
Both local and global secondary indexes are defined when defining the model.
var BlogPost = vogels.define('BlogPost', {
  hashKey : 'email',
  rangeKey : 'title',

  schema : {
    email : Joi.string().email(),
    title : Joi.string(),
    content : Joi.binary(),
    PublishedDateTime : Joi.date()
  },

  // one local and one global secondary index
  indexes : [
    { hashKey : 'email', rangeKey : 'PublishedDateTime', type : 'local', name : 'PublishedIndex' },
    { hashKey : 'title', rangeKey : 'email', type : 'global', name : 'TitleEmailIndex' }
  ]
});
What is the status regarding the IN operator?
The IN operator is documented on the main page in the scan section, but apparently the corresponding test is still disabled.
I have tried to use the IN operator based on the example code, but it raises a ValidationException.
Executing this:
'use strict';

var AWS = require('aws-sdk');
var vogels = require('vogels');
var Joi = require('joi');

AWS.config.update({
  accessKeyId: 'accessKeyId',
  secretAccessKey: 'secretAccessKey',
  region: 'region'
});

vogels.dynamoDriver(new AWS.DynamoDB({ endpoint: 'http://localhost:8000' }));

var Account = vogels.define('Account', {
  hashKey : 'email',

  // add the timestamp attributes (updatedAt, createdAt)
  timestamps : true,

  schema : {
    email : Joi.string().email(),
    name : Joi.string(),
    age : Joi.number(),
    roles : vogels.types.stringSet(),
    settings : {
      nickname : Joi.string(),
      acceptedTerms : Joi.boolean().default(false)
    }
  }
});

Account.config({tableName: 'AccountsTable'});

vogels.createTables({
  'example-Account' : {readCapacity: 1, writeCapacity: 10},
}, function (err) {
  if(err) {
    console.log('Error creating tables', err);
    process.exit(1);
  }

  Account.create({email: '[email protected]', name: 'Foo Bar', age: 21}, function (err, acc) {
    console.log('created account in DynamoDB', acc.get('email'));

    Account
      .scan()
      .where('name').in(['[email protected]', '[email protected]'])
      .exec(function (err, accounts) {
        console.log('Error:', err);
        console.log('Accounts:', JSON.stringify(accounts));
      });
  });
});
Gives this output:
created account in DynamoDB [email protected]
Error: { [ValidationException: One or more parameter values were invalid: ComparisonOperator IN is not valid for L AttributeValue type]
message: 'One or more parameter values were invalid: ComparisonOperator IN is not valid for L AttributeValue type',
code: 'ValidationException',
time: Wed Mar 18 2015 16:24:17 GMT+0100 (W. Europe Standard Time),
statusCode: 400,
retryable: false,
retryDelay: 0 }
Accounts: undefined
Should the IN operator be removed from the documentation if it is not possible to use it?
When I npm install vogels (2.0.0-rc9), the package.json under node_modules/vogels has this dependency:
"dynamodb-doc": "git+https://github.com/awslabs/dynamodb-document-js-sdk"
It requires git and causes problems when deploying to Elastic Beanstalk on AWS.
The current version used is extremely out of date. Updating it, however, is potentially a major breaking change.
It would be great if the schema also allowed for definition of local secondary index attribute projection, in a similar way to the global index projections.
How might I configure this package to use a locally running DynamoDB (based on http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.DynamoDBLocal.html)?
The Table.create method uses joi to check that all of the supplied data is valid before attempting to serialize it and write it to the database (table.js line 116). However, Table.update does not do any data validation. This can result in invalid data being saved to the db. It can also cause a crash since some of the serializer code assumes that it is working with valid data. (For example, if the user performs an update with an invalid Date, a later call to Date.toISOString will raise an uncaught error.)
Currently, backing up data in DynamoDB is very challenging. I want an export feature that runs a parallel scan and streams the JSON data to a writable stream.
It looks like vogels implements retry logic if a request fails, but it seems to just hammer the DynamoDB API until it either fails or gets a success. According to Amazon, retryable errors should be retried with something like an exponential backoff.
References:
Vogels Retry Logic
Amazon Exponential Backoff
Thoughts?
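To make the suggestion concrete, here is a minimal sketch of exponential backoff around a callback-style operation. The names `backoffDelay`, `retryWithBackoff`, and the `options` fields are made up for illustration and are not vogels code; the only thing taken from the SDK is the `retryable` flag that appears on its error objects.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Runs `operation(callback)` and retries it with growing delays while the
// error is marked retryable and the retry budget is not used up.
function retryWithBackoff(operation, options, callback) {
  var attempt = 0;

  function run() {
    operation(function (err, result) {
      if (err && err.retryable && attempt < options.maxRetries) {
        var delay = backoffDelay(attempt, options.baseMs, options.maxMs);
        attempt += 1;
        return setTimeout(run, delay);
      }
      // Success, a non-retryable error, or retries exhausted.
      callback(err, result);
    });
  }

  run();
}

console.log(backoffDelay(0, 50, 1000), backoffDelay(3, 50, 1000), backoffDelay(10, 50, 1000));
```

A production version would probably add random jitter to the delay so that many clients do not retry in lockstep.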
In DynamoDB's putItem you pass
{
  "Expected" : {
    "someKey" : {
      "Exists": false
    }
  }
}
BatchGetItem right now only supports doing multiple GetItem calls on a single table. It would be nice if it also supported doing GetItem calls on multiple tables as well.
Not sure how the design of this should be, though, since all operations are functions inside a specific table object.
What are your thoughts on adding something similar to what mongoose has that only saves input that is part of the schema? I need to sanitize input before saving the model using vogels. I thought about doing something like this with lodash, which would work, but it would be nice if this was built into vogels.
var _ = require('lodash');

// in a user create express route...
var userData = _.pick(req.body, 'email', 'name', 'username', 'title');

User.create(userData, function (err, user) {
  console.log('created account in DynamoDB', user.get('email'));
});
This would only save the email, name, username, and title to the user, even if something like admin: true was in the body.
I know mongoosejs has a strict implementation that might be interesting in vogels.
Thoughts?
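For discussion, a dependency-free sketch of what a strict mode might do internally: keep only the keys the schema declares. `pickSchemaKeys` and the example attribute list are hypothetical and not an existing vogels option.

```javascript
// Returns a copy of `input` containing only the attributes named in `schemaKeys`.
function pickSchemaKeys(input, schemaKeys) {
  var out = {};
  schemaKeys.forEach(function (key) {
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      out[key] = input[key];
    }
  });
  return out;
}

// Example request body with an attribute the schema does not know about.
var body = { email: 'foo@example.com', name: 'Foo Bar', admin: true };
var userData = pickSchemaKeys(body, ['email', 'name', 'username', 'title']);
console.log(userData); // admin is dropped; username and title are simply absent
```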
createTables.js -> createTables starts with a 'describeTable' request, but never checks whether that request returned an error. This becomes a problem especially when you don't have proper AWS credentials: the application hangs forever "waiting for the table to become ACTIVE", which of course can never happen.
Solution: add if(err) throw err; or similar to
https://github.com/ryanfitz/vogels/blob/master/lib/createTables.js#L14
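A sketch of the kind of check being asked for, written against a stand-in `describeTable(callback)` function rather than the actual vogels source. `waitUntilActive` is a hypothetical name, and passing the error to the callback is used here instead of throwing. The "table does not exist yet, so create it" branch is deliberately left out.

```javascript
// Polls describeTable until the table is ACTIVE, but gives up immediately
// when describeTable itself reports an error (for example bad credentials).
function waitUntilActive(describeTable, pollMs, callback) {
  describeTable(function (err, data) {
    if (err) { return callback(err); }
    if (data && data.Table && data.Table.TableStatus === 'ACTIVE') {
      return callback(null, data.Table);
    }
    setTimeout(function () {
      waitUntilActive(describeTable, pollMs, callback);
    }, pollMs);
  });
}

// Stub that fails the way a request without valid credentials might.
function failingDescribe(cb) { cb(new Error('Missing credentials')); }

waitUntilActive(failingDescribe, 1000, function (err) {
  console.log('describeTable failed:', err.message);
});
```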
aws-sdk uses $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY from the environment if they exist; since vogels uses aws-sdk internally, setting these values also works for it.
I know reading defaults from the environment can be controversial, but would you be open to a PR where vogels also read the region from an environment var called $VOGELS_AWS_REGION or something?
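Roughly what I have in mind, as a sketch only: `resolveRegion` is a hypothetical helper, the variable name is the one proposed above, and the fallback region is an arbitrary example.

```javascript
// Picks the region from the proposed environment variable, else a fallback.
function resolveRegion(env, fallback) {
  return env.VOGELS_AWS_REGION || fallback;
}

var region = resolveRegion(process.env, 'us-east-1');
console.log('region:', region);
// The result would then be handed to the SDK config, e.g.
// vogels.AWS.config.update({ region: region });
```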
Is it possible to save createdAt & updatedAt as integers?
I am considering migrating to DynamoDB for one of my projects, and vogels looks like an excellent option. However, I would really like to see support for using promises for handling callbacks. Node promise library: https://github.com/kriskowal/q
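Until something like this is built in, a thin wrapper can turn any node-style callback method into one that returns a promise. The sketch below uses the built-in Promise instead of q, and `promisifyMethod` and `FakeAccount` are hypothetical stand-ins, not vogels APIs.

```javascript
// Wraps target[methodName], which takes a trailing (err, result) callback,
// in a function that returns a Promise instead.
function promisifyMethod(target, methodName) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    return new Promise(function (resolve, reject) {
      args.push(function (err, result) {
        if (err) { return reject(err); }
        resolve(result);
      });
      target[methodName].apply(target, args);
    });
  };
}

// Stand-in for a model with a callback-style create method.
var FakeAccount = {
  create: function (attrs, callback) { callback(null, attrs); }
};

var createAccount = promisifyMethod(FakeAccount, 'create');
createAccount({ email: 'foo@example.com' }).then(function (acc) {
  console.log('created', acc.email);
});
```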
It would be very useful to have the ability to bind functions to each object returned from a Model get/scan/query, similar to MongooseJS's (MongoDB ORM) instance methods: http://mongoosejs.com/docs/guide.html#methods
Vogels could allow methods to be added to a Model definition and then bind them to its returned objects using UnderscoreJS's bind method (http://underscorejs.org/#bind).
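A small sketch of the binding step, using the native Function.prototype.bind in place of Underscore's bind. `attachMethods` and the `attrs` property on the item are assumptions for illustration, not how vogels items are actually structured.

```javascript
// Copies each method onto the item, bound so `this` always refers to the item.
function attachMethods(item, methods) {
  Object.keys(methods).forEach(function (name) {
    item[name] = methods[name].bind(item);
  });
  return item;
}

// Methods as they might be declared alongside a model definition.
var methods = {
  displayName: function () {
    return this.attrs.name + ' <' + this.attrs.email + '>';
  }
};

var item = attachMethods({ attrs: { name: 'Foo Bar', email: 'foo@example.com' } }, methods);
console.log(item.displayName());
```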
Have you considered a JSON or Object schema type that is simply a serialized JSON string but which would be automatically stringified or parsed on the way in/out of Dynamo? Would you be open to this functionality as a pull request? Storing a JSON blob in Dynamo is a common pattern I find myself using.
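The core of such a type would be a serialize/deserialize pair like the sketch below. The `jsonType` object and its method names are hypothetical; how it would plug into the vogels schema is left open.

```javascript
// Hypothetical schema type: objects are stored as a JSON string attribute
// and parsed back into objects when read.
var jsonType = {
  serialize: function (value) { return JSON.stringify(value); },
  deserialize: function (stored) { return JSON.parse(stored); }
};

var stored = jsonType.serialize({ theme: 'dark', tags: ['a', 'b'] });
var restored = jsonType.deserialize(stored);
console.log(typeof stored, restored.tags.length);
```

One caveat with this approach: values such as Date objects or undefined do not survive a JSON round trip unchanged.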
@ryanfitz - curious, why did you pick lodash over lazy.js for internals? We really like vogels and would like to start using it in some of our products (and contributing back). However, we are still in our evaluation phase on which data mapper to use. In practice we have seen lazy.js be faster in most comparisons. Just curious about your thought process. Thanks in advance.
Trying to save a null throws a ValidationException from dynamo.
AWS announced a new feature today that allows you to create or drop global secondary indexes on a table without having to delete and recreate the table. Vogels should support this feature in updateTable()
I'm not doing anything other than loading in my credentials from a json file:
vogels.AWS.config.loadFromPath('./credentials.json');
This is my credentials file:
{"accessKeyId": "key", "secretAccessKey": "secret", "region": "us-west-1"}
When I try to connect to the db, I'm getting this regardless of what region I put in:
[ConfigError: Missing region in config]
This piece is useful for offline development and unit tests. Does vogels support that?
Is there a standard way to use geolocation in vogels? If not, what would it take to build such a feature?
Regards,
Debdatta Basu
It would be great to have a way to define the DynamoDB endpoint as I'd like to use DynamoDB local when running tests. An example configuration can be found here:
http://stackoverflow.com/questions/19868938/how-i-can-work-with-amazons-dynamodb-local-in-node
In a scan, limit restricts the number of items scanned. It would be nice if there were a way to limit the number of matched items a scan returns. Vogels should also probably recurse through each LastEvaluatedKey until the limit is met.
Query also appears to have a similar issue, but only when the query response is getting paginated (such as when it hits the 1MB limit per response). If I do a limit(500) and only get, for example, 100 items returned with a LastEvaluatedKey, Vogels should recurse through and continue to get more responses until the limit is met, or no more results are available.
Right now the only workaround for these is doing a loadAll, which does not respect limit and will always get all available items (which can potentially waste throughput).
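The behaviour I'm describing, sketched against a stub page fetcher. `collectUntilLimit`, `fetchPage`, and `fakeFetch` are hypothetical names, and the stub uses strings for LastEvaluatedKey where DynamoDB would return a key object.

```javascript
// Keeps requesting pages until `limit` items are collected or there is
// no LastEvaluatedKey left to continue from.
function collectUntilLimit(fetchPage, limit, callback) {
  var items = [];

  function next(startKey) {
    fetchPage(startKey, function (err, page) {
      if (err) { return callback(err); }
      items = items.concat(page.Items);
      if (items.length >= limit || !page.LastEvaluatedKey) {
        return callback(null, items.slice(0, limit));
      }
      next(page.LastEvaluatedKey);
    });
  }

  next(null);
}

// Stub: three pages of results chained by LastEvaluatedKey.
var pages = {
  start: { Items: [1, 2], LastEvaluatedKey: 'k1' },
  k1: { Items: [3, 4], LastEvaluatedKey: 'k2' },
  k2: { Items: [5] }
};
function fakeFetch(startKey, cb) { cb(null, pages[startKey || 'start']); }

collectUntilLimit(fakeFetch, 3, function (err, items) {
  console.log(items); // [ 1, 2, 3 ]
});
```

Unlike loadAll, this stops issuing requests as soon as the limit is reached, so it does not spend throughput on pages that are never used.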
I think support for DynamoDB transactions would be a valuable addition to this library, since most applications have this requirement.
Vogels already has support for internal schema types such as Boolean or UUID. It'd be nice if there was an easy way to define and register additional schema types for project specific use.
It looks like at https://github.com/ryanfitz/vogels/blob/master/lib/utils.js#L82 the util class calls the callback (in this case resultsFunc) without reporting that an error happened. If the error is recoverable, like ProvisionedThroughputExceededException, the error should either be reported back to the query.execute method, or the library should try to handle the error through a retry. Right now no error is reported for retryable errors, and the query is not automatically retried.
Happy to submit a PR if you have a direction you would like to go with solving this.
var vogels = require('vogels');
vogels.AWS.config.loadFromPath('config/credentials.json');

var Account = vogels.define('Account', function (schema) {
  schema.String('email', {hashKey: true});
  schema.String('name').required(); // name attribute is required
  schema.Number('age'); // age is optional
  schema.Date('created', {default: Date.now});
});

Account.config({tableName: 'Account'});

Account.create({email: '[email protected]', name: 'Foo Bar', age: 21}, function (err, acc) {
  console.trace(err);
});
fails with
Trace: { [ResourceNotFoundException: Requested resource not found]
message: 'Requested resource not found',
code: 'ResourceNotFoundException',
time: Tue Aug 12 2014 23:18:58 GMT+0530 (IST),
statusCode: 400,
retryable: false }
Issue goes away on removing
Account.config({tableName: 'Account'});
When defining a schema like:
var Item = vogels.define('Item', function (schema) {
  schema.UUID('id', {hashKey: true});
  schema.Number('asset_id').required();
  schema.Number('user_id').required();
  schema.String('asset');
});
Calling
Item.create({id: 1, asset_id: 2, user_id: 3, asset: ''})
fails with
details=[message=the value of asset is not allowed to be empty, path=asset, type=any.empty]
If Dynamo cannot get all 100 items in a batch get request, vogels replays the UnprocessedKeys. However, this object needs to be wrapped back in an object under the key RequestItems. As it stands now, a parameter validation error will result in the aws-sdk because it doesn't see a RequestItems key on the request. PR forthcoming.
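The wrapping amounts to something like this sketch. `buildReplayRequest` is a hypothetical helper name and the sample response is made up; the RequestItems and UnprocessedKeys shapes follow the BatchGetItem API.

```javascript
// Builds the follow-up BatchGetItem request from a response's UnprocessedKeys,
// or returns null when nothing is left to fetch.
function buildReplayRequest(response) {
  var unprocessed = response.UnprocessedKeys || {};
  if (Object.keys(unprocessed).length === 0) { return null; }
  return { RequestItems: unprocessed };
}

// Example response in which one key was not processed.
var response = {
  Responses: { Account: [] },
  UnprocessedKeys: {
    Account: { Keys: [{ email: { S: 'foo@example.com' } }] }
  }
};

console.log(JSON.stringify(buildReplayRequest(response)));
```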
As regards subqueries, I have already set up instance methods on my models that will go out and grab related records. I am injecting them automatically by parsing my model pseudo-code as such:
model User:
{
  hashKey: 'id',
  schema: {
    id: vogels.types.uuid(),
    orders: vogels.types.stringSet()
  }
}
model Order:
{
  hashKey: 'id',
  schema: {
    id: vogels.types.uuid(),
    user: Joi.string()
  }
}
In this case I'd store the id attribute from each related order in the Users model instance 'orders' set, then use Order.getItems() to query the orders back out by their keys rather than scanning the whole Order table looking for a user id that matched or building global secondary indexes on that attribute.
Right now when I create my models I inject getters for the related data into Model.prototype. This works fine for lazy loading, but sometimes I want to make a single query and get back related model instances already included in a model instance substructure that lets me interact with them against the other model's API.
So basically, I have lazy loading solved, but I need a solution similar to how you roll up multiple queries into a single response to get around dynamo's max records in a resultset restriction, except synthesizing results from a subquery to another table rather than aggregating results for multiple queries on the same table.
This looks great! I plan on using DynamoDB a lot on the client side/browser. Are there any plans to support this? Thanks.
When the expected option is passed to update() it throws Error: Need to pass in a valid Condition Object.
From what I can tell, the Expected.<key>.Value attribute is missing the data type per the updateItem api docs.
If you turn on AWS SDK logging you can see the data going over the wire.
...
AWS.config.update({logger: process.stdout});
var mWsAccessToken = vogels.define('WsAccessToken', {
  tableName: cfg.dataModelPrefix+'-ws-access-tokens',
  timestamps: true,
  hashKey: 'token',
  rangeKey: 'expires',
  schema: {
    token: Joi.string(),
    expires: Joi.date(),
    tokenVersion: Joi.number(),
    userId: Joi.string(),
    ip: Joi.string()
  }
});
...
return pWrapper.ninvoke(mWsAccessToken,"create",{
  token: token,
  expires: expires,
  tokenVersion: currentTokenVersion,
  userId: userId,
  ip: optionalIp
})
.then(function(oWsAccessToken){
  console.error("DATA", oWsAccessToken);
  console.error("token",oWsAccessToken.get("token"),"expires",oWsAccessToken.get("expires"));
  return pWrapper.ninvoke(mWsAccessToken,"update",{token: oWsAccessToken.get("token"), expires: oWsAccessToken.get("expires"), ip: "5.5.5.5"}, {expected: {userId: "1235"}});
})
.then(function(oWsAccessToken){
  console.error("DATA After update", oWsAccessToken);
})
.catch(function(e){
  console.error("ERROR", e, e.stack);
});
The update above produces: Error: Need to pass in a valid Condition Object.
With the following data over the wire:
updateItem({ TableName: 'ryan-ws-access-tokens',
Key:
{ token: 'MSAxNDI4MDg4NzcxODc5IDEyMzQgMS4yLjMuNA==',
expires: '2015-04-03T19:19:31.879Z' },
ReturnValues: 'ALL_NEW',
UpdateExpression: 'SET #ip = :ip, #updatedAt = :updatedAt',
ExpressionAttributeValues: { ':ip': '5.5.5.5', ':updatedAt': '2015-03-04T19:19:32.536Z' },
ExpressionAttributeNames: { '#ip': 'ip', '#updatedAt': 'updatedAt' },
Expected: { userId: { Value: '1235' } } })
For comparison, if the {expected: {userId: "1235"}} argument is removed from the update call, it produces the following over the wire (which succeeds):
updateItem({ TableName: 'ryan-ws-access-tokens',
Key:
{ token: { S: 'MSAxNDI4MDg4NTQ4NjE4IDEyMzQgMS4yLjMuNA==' },
expires: { S: '2015-04-03T19:15:48.618Z' } },
ReturnValues: 'ALL_NEW',
UpdateExpression: 'SET #ip = :ip, #updatedAt = :updatedAt',
ExpressionAttributeValues:
{ ':ip': { S: '5.5.5.5' },
':updatedAt': { S: '2015-03-04T19:15:49.294Z' } },
ExpressionAttributeNames: { '#ip': 'ip', '#updatedAt': 'updatedAt' } })
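One possible direction, offered as a sketch: as far as I know DynamoDB does not accept the legacy Expected parameter in the same request as an UpdateExpression, so the expected map could be translated into a ConditionExpression instead. `expectedToCondition` and the `#expN` / `:expN` placeholder names are hypothetical, only equality checks are covered, and merging the result with the names and values already used by the UpdateExpression is not shown.

```javascript
// Translates {attr: value} pairs into an equality ConditionExpression
// plus the placeholder maps it needs.
function expectedToCondition(expected) {
  var names = {};
  var values = {};
  var clauses = Object.keys(expected).map(function (key, i) {
    names['#exp' + i] = key;
    values[':exp' + i] = expected[key];
    return '#exp' + i + ' = :exp' + i;
  });
  return {
    ConditionExpression: clauses.join(' AND '),
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values
  };
}

console.log(expectedToCondition({ userId: '1235' }).ConditionExpression);
// prints "#exp0 = :exp0"
```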
Need to create docs on how to use vogels.
which makes it difficult to keep the results in the same order
Are optional non-key schema attributes supported? It seems that they're not, and I was wondering if this is by design.
The issue I'm running into is that if I set {optional: true} for an attribute in the schema, it passes Joi validation but still gets serialized, which means the AWS validation fails because the key exists, but has a value of null/empty string (which is not allowed).
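As a workaround sketch, the empty attributes could be dropped before the item is serialized. `omitEmpty` is a hypothetical helper, not something vogels provides, and it only looks at top-level attributes.

```javascript
// Returns a copy of `attrs` without null, undefined, or empty-string values.
function omitEmpty(attrs) {
  var out = {};
  Object.keys(attrs).forEach(function (key) {
    var value = attrs[key];
    if (value === null || value === undefined || value === '') { return; }
    out[key] = value;
  });
  return out;
}

console.log(omitEmpty({ id: 1, asset: '', note: null, name: 'x' }));
```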