cs9864-realtime-bluemix's People

Contributors

alheure2, bobtheta, nava2

cs9864-realtime-bluemix's Issues

Monitoring service

Users subscribe to "triggers", e.g. a drop of 5% or a massive share liquidation.

Two types of triggers are more than enough.

Notify a user when a subscribed trigger fires. Several notification channels could be made "available", but since this is a PoC, only one is needed.

Twilio? https://www.twilio.com/
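
A minimal sketch of how a "drop by X%" trigger might be evaluated, assuming prices arrive as a previous/current pair. The `Trigger` fields, the `evaluate` function and the `notify` callback are all hypothetical names; the callback stands in for whichever notification channel is chosen.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trigger:
    """A user-subscribed condition; all fields here are illustrative."""
    user: str
    symbol: str
    drop_percent: float  # fire when the price falls by at least this much


def percent_drop(previous: float, current: float) -> float:
    """Return the fall from previous to current as a percentage (negative if it rose)."""
    return (previous - current) / previous * 100.0


def evaluate(triggers: List[Trigger], symbol: str, previous: float,
             current: float, notify: Callable[[str, str], None]) -> int:
    """Call notify(user, message) for every trigger that fires; return the count."""
    drop = percent_drop(previous, current)
    fired = 0
    for trigger in triggers:
        if trigger.symbol == symbol and drop >= trigger.drop_percent:
            notify(trigger.user, f"{symbol} dropped {drop:.1f}%")
            fired += 1
    return fired
```

Keeping the channel behind a callback means the PoC can ship with one channel and add others later without touching the trigger logic.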

Forwarding Service

Implement a service that forwards data in "packages" of some time frame to clients that register with the service.
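
One possible reading of "packages of some time frame" is fixed-size time windows. The sketch below groups records that way; the `(timestamp, payload)` record shape and the `package_by_window` name are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def package_by_window(records: Iterable[Tuple[float, str]],
                      window_seconds: float) -> Dict[int, List[str]]:
    """Group (timestamp, payload) records into consecutive fixed-size time windows."""
    packages: Dict[int, List[str]] = defaultdict(list)
    for timestamp, payload in records:
        # Records whose timestamps fall in the same window share an index.
        window_index = int(timestamp // window_seconds)
        packages[window_index].append(payload)
    return dict(packages)
```

Each resulting package could then be sent to every registered client once its window closes.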

I'm undecided about this service. Leaving it unassigned currently.

Finalize Progress Report 1

Finalize Progress Report 1.

  • Edit for completion
  • Confirm organization -- for missing parts, create new issues
  • Submit

Streaming Service

This will be implemented on CSD Infrastructure.

API

  • query to connect, responds with port to listen on
  • Utilize UDP multicast; the server uses fire-and-forget
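
A rough sketch of the two API points above using Python's standard `socket` module. The JSON shapes and the function names are assumptions, not taken from the CSD spec; the caller supplies the multicast group and port.

```python
import json
import socket


def connect_response(port: int) -> bytes:
    """Reply to a connect query with the port the client should listen on."""
    return json.dumps({"port": port}).encode("utf-8")


def make_sender(ttl: int = 1) -> socket.socket:
    """Create a UDP socket configured for multicast sends."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL 1 keeps datagrams on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def fire_and_forget(sock: socket.socket, group: str, port: int, tick: dict) -> None:
    """Send one datagram to the group; no acknowledgement is awaited and no retry is made."""
    sock.sendto(json.dumps(tick).encode("utf-8"), (group, port))
```

Because nothing is acknowledged, clients that miss a datagram simply never see it; that is the trade-off fire-and-forget accepts.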

Specs information: CSDEmail.pdf

f5: Real-Time Data Processing

Des: The system shall process data originating from various sources and merge it, providing services with real-time streaming access to the combined data. The system shall therefore support the combination of multiple data sources in real time.

BIG DATA: variety, volume, velocity

Rationale: Combining data sources is important so that user services can extract better insights. Enabling the system to combine data sources will give end users easier access to the data.

f4: API Access

Des: End users shall transparently interact with “client services” through a single API entry point for interaction with the “application”.

q4: The API shall provide simultaneous connections for up to xxx users without any performance impediment.

Rationale: Simplicity in the client API lowers the barrier to entry for client application developers to use the service.

Business Service Registrar

This is the service that runs the registrar.

All components are up for debate; this is a suggestion. This should utilize the component in #24.

Configuration

It should have the following configurable parameters:

  1. Specification of database parameters -- whether it's key-value/Cloudant; we need to know where the storage is

API

All responses must:

  • Use JSON for response
  • Respond within a reasonable time (20ms?)

GET: list

  • List all services + end points

PUT: register(identifier, endpoint)

Register an end-point with the service

DELETE: remove(identifier)

Remove an end-point

f3: Addition/Removal of Data services

Des: The system shall support data from a variety of sources, so a means of connecting to various data sources should be provided. Through this interface, administrators can add and remove data sources.

q3: The connection/disconnection to data sources should not affect the availability of the system.

Rationale: Having an easy way of deploying API connection services will make it easier to deploy new services that may utilize the data. Since the purpose of the system is to offer Data as a Service, the addition of new data sources should be facilitated.

Create Project Plan

Using GitHub's milestone software, create a project plan outlining when parts need to be done.

Services should be broken into sub-issues with a single over-arching issue. "Sub-issues" should follow the 8/80 rule leaning more towards 8h chunks.

All tasks must be aligned with Milestones. I've already "started" milestone 1, which is progress report 1.

  • Report milestones
    1. Progress Report 1 - 11 Mar 2016
    2. Progress Report 2 - 1 Apr 2016
  • Services
    1. Streaming Data -- part of Progress Report 1
    2. Registrar code
    3. Storage scheme
    4. API end-points
    5. News aggregation
    6. Repository cleaning daemon (only keeping 5 (?) days of data)

@brogly could you please comment with expected deliverable dates discussed last Friday.

Implement heartbeat checks for registrar

The client registrar needs a heartbeat to check on its registered services.

Suggested implementation: for each URL, every X seconds, send a HEAD / request; if success or a non-network error is returned, the service is available.

  • 2XX -> Still good!
  • 3XX -> No change, still good
  • 4XX -> Service is still up; the page just does not exist or the request is unauthorized
  • 5XX -> Bad, remove the service
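
The status mapping above could be sketched as follows. The `probe` callable is a hypothetical stand-in for the actual HEAD request and returns `None` on a network error; treating a network error as "unavailable" is an assumption read from the suggestion, not something the issue spells out.

```python
from typing import Callable, Dict, List, Optional


def is_available(status: Optional[int]) -> bool:
    """Map a HEAD / result to availability; None stands for a network error."""
    if status is None:
        return False  # assumed: no response at all means the service is down
    return 200 <= status < 500  # 2XX, 3XX and 4XX keep the service; 5XX does not


def sweep(services: Dict[str, str],
          probe: Callable[[str], Optional[int]]) -> List[str]:
    """Probe every registered URL once, drop the failing ones, return their names."""
    removed = [name for name, url in services.items()
               if not is_available(probe(url))]
    for name in removed:
        del services[name]
    return removed
```

A timer would call `sweep` every X seconds with the registrar's current service map.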

f1: Web service registry

Des: The system shall have a web service registry interface that enables users to search for and subscribe to various web services.

q1: The availability of this service shall be congruent with the Bluemix uptime guarantee of 99.95%.
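
As a rough illustration of what a 99.95% target allows, the downtime budget is simply the unavailable fraction times the period. The helper name below is hypothetical.

```python
def allowed_downtime_hours(availability: float, period_hours: float) -> float:
    """Return the hours of downtime permitted over a period at a given availability."""
    return (1.0 - availability) * period_hours


# 99.95% over a 365-day year (8760 hours) leaves about 4.38 hours of downtime.
yearly_budget = allowed_downtime_hours(0.9995, 365 * 24)
```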

Rationale: Having a centralized list of currently available services makes the system more usable and easier for users to work with.

Aggregation Service

Using #26 and the libraries #27, #28 and #29, implement a service that:

  1. Receives stock information
  2. For a stock, queries the data sources on a time frame and saves the results for later

I recommend completing #27, #28 OR #29 before working on this. Ignore secondary sources until after this issue is completed.

Configuration

  1. Database parameters

f9: Data storage

Des: Data originating from data/processing services shall persist.

q9: Data entering the system shall be stored quickly and effectively, within 100ms of processing.

Rationale: Storing the data instead of simply forwarding will allow for historical data and data processing performance improvement.

Client API

Create a client API that has the following routes:

  • /list Lists all client services
  • /:id/:rest Forwards the request to the correct client service if it exists; otherwise returns 404 or a relevant error
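
A minimal sketch of the `/:id/:rest` lookup, assuming the registered client services are held as a map from id to base URL. The `resolve` name and the `(status, target)` return shape are illustrative; `/list` would be handled by a separate route.

```python
from typing import Dict, Optional, Tuple


def resolve(path: str, services: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status, target URL) for a /:id/:rest path; 404 if the id is unknown."""
    parts = path.lstrip("/").split("/", 1)
    service_id = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    base = services.get(service_id)
    if base is None:
        return 404, None
    # Join the base URL and the remainder with exactly one slash.
    return 200, f"{base.rstrip('/')}/{rest}"
```

The caller would then proxy the original request to the returned target URL.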

Database Cleaning Service

Service that cleans up data on a timer (e.g. every day?), removing data that is older than x time.

Configuration

  1. Database parameters
  2. Time limit

This can be modified in cloudant.json.

API

Force clean

  • Force the API to clean-up
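
The cleaning step itself might look like the sketch below, using an in-memory dict in place of the real database. The `clean` name and the key-to-timestamp record shape are assumptions; a force-clean call and the timer could both invoke the same function.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def clean(records: Dict[str, datetime], now: datetime,
          time_limit: timedelta) -> List[str]:
    """Delete records older than time_limit in place; return the removed keys."""
    cutoff = now - time_limit
    expired = [key for key, stored_at in records.items() if stored_at < cutoff]
    for key in expired:
        del records[key]
    return expired
```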

f6: Real-Time Data Requesting

Des: The system shall provide client services with access to real-time, historical, and external feeds upon request. The system shall provide services with data query access and return data results as large as requested. Clients may ask client services for assorted variations of data, which may require processing.

BIG DATA: volume x velocity x variety

q6: The system shall handle sending query responses of up to 3GB in size. The data shall be sent with a response latency between 0.5 and 2 seconds.

Rationale: Larger data sets allow for deeper analysis and better insights and applications; the system shall therefore support sending such large amounts of varied data.

Pull Library: NYT

Investigate and implement querying the NYT for information from a date-range about a stock/equity.

Related: #27, #29

Service Registrar Library

This should be "library" code that we can require and reuse within services.

All components are up for debate; this is a suggestion.

Configuration

It should have the following configurable parameters:

  1. Specification of database parameters -- whether it's key-value/Cloudant; we need to know where the storage is
  2. Maximum clients supported

API

All responses must:

  • Use JSON for response
  • Respond within a reasonable time (20ms?)

GET: list

  • List all services + end points

PUT: register(identifier, endpoint)

Register an end-point with the service

DELETE: remove(identifier)

Remove an end-point
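
A sketch of the three operations as reusable library code, with an in-memory dict standing in for the configured database. The class name, method names and JSON response shapes are assumptions; only the list/register/remove operations and the maximum-clients parameter come from the description above.

```python
import json
from typing import Dict


class Registrar:
    """In-memory stand-in; a real one would use the configured database."""

    def __init__(self, max_clients: int = 100) -> None:
        self.max_clients = max_clients
        self._services: Dict[str, str] = {}

    def list_services(self) -> str:
        """GET: list all services and their end points as JSON."""
        return json.dumps({"services": self._services}, sort_keys=True)

    def register(self, identifier: str, endpoint: str) -> str:
        """PUT: register (or update) an end point under an identifier."""
        is_new = identifier not in self._services
        if is_new and len(self._services) >= self.max_clients:
            return json.dumps({"ok": False, "error": "maximum clients reached"})
        self._services[identifier] = endpoint
        return json.dumps({"ok": True})

    def remove(self, identifier: str) -> str:
        """DELETE: remove an end point; report whether it existed."""
        existed = self._services.pop(identifier, None) is not None
        return json.dumps({"ok": existed})
```

Both the business and client registrar services could wrap an instance of this class behind their HTTP routes.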

f10: Modular Services

Des: Each service shall be built independently of the others. No end service shall make internal calls to another end service. If services wish to use other services, they shall communicate with them in a manner similar to end users.

q10: For maintainability, dependence upon other services shall be explicit and external. With this dependence clearly indicated, the registry can control the availability of the system effectively.

Rationale: Having little dependence on specific instances of other services gives strong fault tolerance, scalability, and throughput when the system is balanced.

Pull Library: Guardian

Investigate and implement querying the Guardian for information from a date-range about a stock/equity.

Related: #28, #27

Pull Library: Webhose.io

Investigate and implement a set of library code to query against webhose.io for information from a date-range about a stock/equity.

Related: #28, #29

Stock Server: All connection errors are treated as catastrophic

Currently, the stock server kills an endpoint on any error. This is not correct: Bluemix sometimes gives 503 errors that are fixed by resending.

Tasks:

  • Implement resending on errors that are relevant
  • If an error is bad enough (e.g. ECONNREFUSED), kill the EndPoint, never the server
  • If an error is minor, keep a count of errors; if a threshold number of errors happen in a row, kill the end point
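
The tasks above could be sketched as a small per-endpoint state tracker. The `EndPointHealth` name, the default threshold of 3 and the contents of `FATAL_ERRORS` are assumptions chosen for illustration.

```python
FATAL_ERRORS = {"ECONNREFUSED"}  # assumed set of errors that kill an end point at once


class EndPointHealth:
    """Track consecutive minor errors for one end point."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_errors = 0

    def record_success(self) -> None:
        """A successful send resets the run of errors."""
        self.consecutive_errors = 0

    def record_error(self, code: str) -> str:
        """Return "kill" to drop the end point or "retry" to resend."""
        if code in FATAL_ERRORS:
            return "kill"
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.threshold:
            return "kill"
        return "retry"
```

A "kill" result only ever removes the end point from the server's registry; the server process itself keeps running.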

f7: Real-Time data forwarding

Des: The system shall support sending continuous data streams so that end-user clients may receive continuous data from client services. The system shall support streaming data originating from one or more data services.

BIG DATA: velocity, volume, variety

q7: The system shall use a stream-processing engine with a latency of 0.5 – 2.0 seconds to process data in real time. The system must run quickly, since a sluggish system may have a negative effect in a financial domain or on client satisfaction. A response time should therefore be within 20ms.

Rationale: Services may be interested in plotting data in real time or having some real-time analytical goal; therefore, the ability to send continuous data without the need for constant requests is important.

Create SOA Presentation

Requirement

Team to do a short presentation on SOA and hosting services on a cloud.

Documents

  • SOA - Exerpt.doc

f2: Addition / Removal of Web services

Des: The system shall have an administrator interface to enable real time addition/removal of services. The system shall therefore maintain an up to date view of currently available services.

q2: The deployment and removal of client services should not cause any service interruption.

Rationale: Having the ability to change which services are available is very helpful. If a service is misbehaving, this will enable an administrator to handle the issue without disruption.

Client Services Registrar

This is the service that registers client services. This will be very similar to #25.

All components are up for debate; this is a suggestion. This should utilize the component in #24.

Configuration

It should have the following configurable parameters:

  1. Specification of database parameters -- whether it's key-value/Cloudant; we need to know where the storage is

API

All responses must:

  • Use JSON for response
  • Respond within a reasonable time (20ms?)

GET: list

  • List all services + end points

PUT: register(identifier, endpoint)

Register an end-point with the service

DELETE: remove(identifier)

Remove an end-point

Create bluemix demo

  • Create a demo presentation for bluemix
  • Demo workflow to create an app and show it working
    1. Generate via express
    2. Deploy via cf
    3. Verify "Hello world" works on bluemix services

Stock Data Stream Handler Service

This will read from the service implemented in #23. It will have a registry of end points to cast against.

Notes to consider:

  • Should clients connect via a socket or via HTTP POST calls?
  • If using UDP/multicast, how do you maintain a common request end point?
  • Who does the ultimate sending from infrastructure to the client?
  • Is data forwarded immediately or grouped?

Configuration

Implementation specific?

API

All responses must:

  • Use JSON for response
  • Respond within a reasonable time (20ms?)

f8: Push Notification

Des: The system shall support sending push notifications to end users upon subscription. The system shall support the setup of triggers by the user and the ability to monitor those triggering events.

q8: Notifications shall be sent within a processing time frame of less than 1 second.

Rationale: The system shall be able to send a notification to the user when an event of interest is detected. Being able to communicate with the user without the user initiating communication is helpful in maintaining performance.
