json_delta's Introduction

Json Delta service

This exercise is from an actual report on ClanHR. The objective is to build a service that tracks model changes and lists them. The service will receive two versions of a user, as JSON:

// old
{"_id": 1,
 "name": "Bruce Norris",
 "address": {"street": "Some street"}}

// new
{"_id": 1,
 "name": "Bruce Willis",
 "address": {"street": "Nakatomi Plaza"}}

Note that these JSON payloads can be large and can contain several nested objects.

After that, a listing endpoint should be made available that returns a collection of data in the form:

[{"field": "name", "old": "Bruce Norris", "new": "Bruce Willis"},
 {"field": "address.street", "old": "Some street", "new": "Nakatomi Plaza"}]
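
One way to produce a listing in this shape is to flatten both versions into dotted paths and compare them field by field. The following is only a minimal sketch: the names flatten and diffVersions are hypothetical and not taken from the repository, arrays are treated as single values, and empty nested objects are ignored.

```javascript
// Hypothetical helper: flattens nested objects into { "a.b": value } pairs.
function flatten(obj, prefix = '', out = {}) {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      flatten(value, path, out); // recurse into nested objects
    } else {
      out[path] = value; // primitives and arrays are treated as leaf values
    }
  }
  return out;
}

// Hypothetical helper: lists the fields whose values differ between two versions.
// Bookkeeping fields such as _timestamp are skipped (an assumption of this sketch).
function diffVersions(oldDoc, newDoc, ignored = ['_timestamp']) {
  const before = flatten(oldDoc);
  const after = flatten(newDoc);
  const fields = new Set([...Object.keys(before), ...Object.keys(after)]);
  const changes = [];
  for (const field of fields) {
    if (ignored.includes(field)) continue;
    // JSON.stringify gives a simple value comparison that also covers arrays.
    if (JSON.stringify(before[field]) !== JSON.stringify(after[field])) {
      changes.push({ field, old: before[field], new: after[field] });
    }
  }
  return changes;
}
```

A field that exists in only one version shows up with undefined on the other side, which may or may not be the desired output format.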

Note that the listing should be filtered by start date and end date, and that we want only one change per user/field within that timespan. So for the given events:

March: Name from A to B
March: Name from B to C
June: Name from C to D

If we filter for March, we should get only one change:

[{"field": "name", "old": "A", "new": "C"}]
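
The collapsing rule above (keep the first "old" and the last "new" per user/field inside the window) could be sketched as follows. The function name collapseChanges and the event shape { _id, field, old, new, _timestamp } are assumptions made for illustration, as are the concrete 2017 dates; events are assumed to arrive ordered by timestamp.

```javascript
// Hypothetical helper: collapses ordered change events into one per id/field.
// The window is half-open: start <= _timestamp < end.
function collapseChanges(events, start, end) {
  const byKey = new Map();
  for (const ev of events) {
    if (ev._timestamp < start || ev._timestamp >= end) continue; // outside window
    const key = `${ev._id}:${ev.field}`;
    const seen = byKey.get(key);
    if (seen) {
      seen.new = ev.new; // later event: keep the first "old", update "new"
    } else {
      byKey.set(key, { field: ev.field, old: ev.old, new: ev.new });
    }
  }
  return [...byKey.values()];
}

const events = [
  { _id: 1, field: 'name', old: 'A', new: 'B', _timestamp: Date.parse('2017-03-05') },
  { _id: 1, field: 'name', old: 'B', new: 'C', _timestamp: Date.parse('2017-03-20') },
  { _id: 1, field: 'name', old: 'C', new: 'D', _timestamp: Date.parse('2017-06-10') },
];
const march = collapseChanges(events, Date.parse('2017-03-01'), Date.parse('2017-04-01'));
console.log(march); // [ { field: 'name', old: 'A', new: 'C' } ]
```

Whether a field that ends the window with its starting value (A to B and back to A) should still be listed is a product decision this sketch does not make.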

Wrap-up

Create two endpoints: one to register JSON model changes, and another to list the changes based on a start/end date filter.

/add {"_id":1,"name":"foo"}
/add [{"_id":1,"name":"bar"},{"_id":2,"name":"foobar"}]
  • endpoint to register model changes, receives individual entry or an array of them
    • add a timestamp to each entry to allow time filtering
    • store json entries in a simple key/value structure, ordered by timestamp
// if no interval is passed, all changes are returned
/diff -d {"start":"2017-01-01", "end":"2017-01-01"} // all changes from all objects that occurred within the time interval
/diff -d {"_id":1, "start":"2017-01-01", "end":"2017-01-01"} // all changes for the object with _id 1 within the time interval
  • endpoint to filter changes, receiving start and end date as parameters
    • get specific elements relative to the time query
    • diff in between ordered elements by date
    • output diff in correct format
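
The registration steps listed for /add (accept one entry or an array, stamp each entry, keep the collection ordered by timestamp) might look roughly like this. It is a sketch under assumptions: the store is a plain in-memory array, and addEntries and its injectable now clock are hypothetical names, not the repository's actual code.

```javascript
// Hypothetical registration step for /add, backed by an in-memory array.
function addEntries(store, payload, now = Date.now) {
  // Accept either a single entry or an array of entries.
  const entries = Array.isArray(payload) ? payload : [payload];
  for (const entry of entries) {
    // Keep a caller-supplied _timestamp, otherwise stamp the entry on arrival.
    store.push({ ...entry, _timestamp: entry._timestamp ?? now() });
  }
  // Array.prototype.sort is stable, so entries with equal timestamps keep arrival order.
  store.sort((a, b) => a._timestamp - b._timestamp);
  return entries.length;
}

const store = [];
addEntries(store, { _id: 1, name: 'foo' });
addEntries(store, [{ _id: 1, name: 'bar' }, { _id: 2, name: 'foobar' }]);
console.log(store.length); // 3
```

Re-sorting on every insert is the simplest option; for a large collection, inserting at the right position would avoid the repeated sort.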

Assumptions:

  • Will use the field _id, to identify same objects for comparing
  • Will use the field _timestamp to identify object changes; if not present, it will be added to the object
  • stateless service, no cache, no data persistence
  • todo
    • [ ] Using a for loop to add multiple items from an array creates entries with the same timestamp, since JS timestamps only have millisecond resolution. One solution could be to pad the timestamp with process.hrtime(), which offers sub-millisecond resolution.
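
As an alternative to the process.hrtime() padding mentioned in the to-do, timestamps can be made unique by never issuing the same value twice. This is a different technique from the one the to-do proposes, shown only as a sketch; createTimestamper is a hypothetical name, and during large bursts the issued values run slightly ahead of the wall clock.

```javascript
// Hypothetical factory: returns a function that hands out strictly increasing
// millisecond timestamps, even when called several times within one millisecond.
function createTimestamper(now = Date.now) {
  let last = 0;
  return function nextTimestamp() {
    const current = now();
    // If the clock has not advanced past the last issued value, bump by 1 ms.
    last = current > last ? current : last + 1;
    return last;
  };
}

const nextTimestamp = createTimestamper();
const stamps = [1, 2, 3].map(() => nextTimestamp());
console.log(stamps[0] < stamps[1] && stamps[1] < stamps[2]); // true
```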

Running locally

npm install
npm test 
npm start

or using docker

docker pull darkua/json-delta

Go into the scripts dir, where you can run ./add.sh to register data and ./diff.sh to diff the data. I also added a benchmark script that adds a large JSON file and times the diff of its contents: ./benchmark.sh

json_delta's People

Contributors

darkua

json_delta's Issues

Parallel parking

Hello,

Somewhere down the line, this service is deployed/delivered via several machines and has a considerable load. It will also need to have its persistence somewhere else. Because of some reasons that you cannot control, you start receiving duplicate requests for the same data. For example:

// request 1
{
  "_id": 1,
  "transactionId": "tx1",
  "name": "Bruce Willis",
  "address": {
    "street": "Nakatomi Plaza"
  }
}

// request 2
{
  "_id": 1,
  "transactionId": "tx1",
  "name": "Bruce Willis",
  "address": {
    "street": "Nakatomi Plaza"
  }
}

The systems that send these requests also started sending a transactionId that allows you to detect that they are duplicates. For some more reasons, it's very problematic business-wise to store duplicate data. You can only store one entry, even if you receive many. And all those requests can be delivered at exactly the same time.

Can you elaborate on an approach to this?
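
One possible direction, sketched here under assumptions rather than as the answer given in the thread: make the write itself idempotent by keying it on transactionId and letting the storage layer decide atomically who was first. The InMemoryStore class and registerChange function are hypothetical; with several machines, insertIfAbsent would have to be backed by shared persistence that enforces uniqueness on transactionId, since an in-process Map only protects a single process.

```javascript
// Hypothetical store: insertIfAbsent must be atomic and return true only for
// the first caller with a given key. A Map is enough inside one Node process
// because the check and the set run synchronously, without interleaving.
class InMemoryStore {
  constructor() {
    this.entries = new Map();
  }

  insertIfAbsent(key, value) {
    if (this.entries.has(key)) return false; // duplicate: keep the first copy
    this.entries.set(key, value);
    return true;
  }
}

// Hypothetical handler step: stores the entry once, reports duplicates.
function registerChange(store, entry) {
  const stored = store.insertIfAbsent(entry.transactionId, entry);
  return { stored, duplicate: !stored };
}

const dedupStore = new InMemoryStore();
const request = { _id: 1, transactionId: 'tx1', name: 'Bruce Willis' };
console.log(registerChange(dedupStore, request)); // { stored: true, duplicate: false }
console.log(registerChange(dedupStore, request)); // { stored: false, duplicate: true }
```

Treating a duplicate as a successful no-op, rather than an error, lets the sender retry safely.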

Filter by date

Hello,

Thanks for making our challenge! It's always interesting to see new approaches to the problem and to learn from it. :)

Now the team will leave some questions and provide some feedback. Here is my first one. To filter by the dates, you do:

return _.filter(elements, function(item) {
    return item._timestamp >= startDate && item._timestamp < endDate
})

Imagine that elements is a very large collection. How could you replace this loop with something faster?

What are the implications of having a CPU heavy operation like that on a request? What would happen to the server?
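
If the entries are kept ordered by _timestamp, as the README describes, one candidate replacement for the full scan is a binary search for the two window boundaries. The sketch below assumes that ordering holds; lowerBound and filterByDate are hypothetical names.

```javascript
// Hypothetical helper: index of the first element whose _timestamp is >= target.
// Requires the array to be sorted by _timestamp in ascending order.
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid]._timestamp < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Same half-open window as the filter above: startDate <= _timestamp < endDate.
function filterByDate(sorted, startDate, endDate) {
  return sorted.slice(lowerBound(sorted, startDate), lowerBound(sorted, endDate));
}

const elements = [1, 2, 2, 3, 5].map((t) => ({ _timestamp: t }));
console.log(filterByDate(elements, 2, 5).map((e) => e._timestamp)); // [ 2, 2, 3 ]
```

Locating the boundaries takes a logarithmic number of comparisons instead of visiting every element. That matters on a request path because Node runs JavaScript on a single thread, so a long synchronous loop delays every other request until it finishes.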

Different databases per environment

Imagine that for some reason your team decides to have different databases per environment. For example, an in-memory db for test mode and postgres for dev, staging and production. How would you improve your solution to be better prepared for this change? What would be your approach?
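
One common approach, shown only as a sketch, is to hide storage behind a small repository interface and choose the implementation from configuration. Everything here is hypothetical: the class names, the add/findBetween method pair, and the use of NODE_ENV and DATABASE_URL as the switches. The Postgres adapter is deliberately left as a placeholder rather than guessing at a driver API.

```javascript
// Hypothetical in-memory adapter, suitable for test mode.
class MemoryRepository {
  constructor() {
    this.items = [];
  }

  async add(entry) {
    this.items.push(entry);
  }

  async findBetween(start, end) {
    return this.items.filter((e) => e._timestamp >= start && e._timestamp < end);
  }
}

// Placeholder for a Postgres-backed adapter exposing the same two methods.
class PostgresRepository {
  constructor(connectionString) {
    this.connectionString = connectionString;
  }

  async add() {
    throw new Error('not implemented in this sketch');
  }

  async findBetween() {
    throw new Error('not implemented in this sketch');
  }
}

// Hypothetical factory: the rest of the service only ever sees the interface.
function createRepository(env = process.env.NODE_ENV, databaseUrl = process.env.DATABASE_URL) {
  return env === 'test' ? new MemoryRepository() : new PostgresRepository(databaseUrl);
}
```

With this split, the endpoints depend only on add and findBetween, so swapping databases per environment becomes a configuration change.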
