insight-data-challenge's Introduction

Insight Data Engineering - Coding Challenge

Requirements

I used Python 2.7.11 for the coding challenge:

admins-MacBook-Air:insight-data-challenge admin$ python
Python 2.7.11 (default, Jan 22 2016, 08:29:18) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

I used the parser.parse function from the python-dateutil module to parse the date and time in the raw tweet JSON, and the SortedListWithKey collection from the sortedcontainers module to keep the tweets ordered by their created_at timestamps. To install the required Python modules with pip, run pip install -r requirements.txt:

(venv) admins-MacBook-Air:insight-data-challenge admin$ pip install -r requirements.txt 
Collecting python-dateutil==2.5.2 (from -r requirements.txt (line 1))
  Using cached python_dateutil-2.5.2-py2.py3-none-any.whl
Collecting six==1.10.0 (from -r requirements.txt (line 2))
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting sortedcontainers==1.4.4 (from -r requirements.txt (line 3))
Installing collected packages: six, python-dateutil, sortedcontainers
Successfully installed python-dateutil-2.5.2 six-1.10.0 sortedcontainers-1.4.4
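
To illustrate the idea of keeping tweets ordered by created_at, here is a minimal sketch. It is not the code in src/utils/twitter.py: it swaps in the standard library (datetime.strptime and bisect.insort) for python-dateutil and SortedListWithKey so that it runs without extra installs, and it is written for Python 3, where strptime accepts %z. The format string is an assumption about how created_at looks in the raw tweet JSON, and the names parse_created_at and insert_ordered are hypothetical.

```python
import bisect
from datetime import datetime

# Assumption: created_at looks like "Thu Mar 24 17:51:10 +0000 2016".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Turn a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


def insert_ordered(tweets, created_at, hashtags):
    """Insert a (timestamp, hashtags) pair, keeping the list sorted by timestamp."""
    bisect.insort(tweets, (parse_created_at(created_at), tuple(hashtags)))


# Tweets may arrive out of order; the list stays sorted regardless.
tweets = []
insert_ordered(tweets, "Thu Mar 24 17:51:30 +0000 2016", ["Spark", "Hadoop"])
insert_ordered(tweets, "Thu Mar 24 17:51:10 +0000 2016", ["Apache"])
oldest, newest = tweets[0], tweets[-1]
```

With the list sorted, the oldest tweet is always at index 0 and the newest at the end, which makes it cheap to find the tweets that have fallen out of a time window.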

Description

The following classes in src/utils/twitter.py calculate the average degree of a vertex in a Twitter hashtag graph over the last 60 seconds, and update it each time a new tweet appears:

  1. Tweet Class - Represents a Tweet with some attributes defined
  2. TweetWithRawData Class - Derived class of Tweet, which contains the raw text
  3. TwitterNode Class - Represents a Twitter Vertex or Node on the Graph
  4. TwitterNodeGraph Class - Represents a graph with Twitter Vertices or Nodes and Edges between them
  5. TwitterNodeGraphWithSlidingWindow Class - Similar to TwitterNodeGraph but has a sliding window
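
The calculation these classes support can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in src/utils/twitter.py: the class name SlidingHashtagGraph is hypothetical, timestamps are plain numbers of seconds, a tweet is assumed to stay in the window while it is strictly less than 60 seconds older than the newest tweet seen, and the graph is rebuilt on every update rather than maintained incrementally.

```python
from collections import defaultdict
from itertools import combinations

WINDOW_SECONDS = 60  # assumed window length, taken from the description above


class SlidingHashtagGraph(object):
    """Hypothetical sketch: hashtag graph over a 60-second sliding window."""

    def __init__(self):
        self.tweets = []    # (timestamp, hashtags) pairs currently in the window
        self.latest = None  # newest timestamp seen so far

    def add(self, timestamp, hashtags):
        """Add a tweet, evict expired tweets, and return the average degree."""
        if self.latest is None or timestamp > self.latest:
            self.latest = timestamp
        self.tweets.append((timestamp, frozenset(hashtags)))
        # Keep only tweets less than WINDOW_SECONDS older than the newest one.
        self.tweets = [(ts, tags) for ts, tags in self.tweets
                       if self.latest - ts < WINDOW_SECONDS]
        return self.average_degree()

    def average_degree(self):
        """Average number of distinct neighbours per hashtag in the window."""
        neighbours = defaultdict(set)
        for _, tags in self.tweets:
            # Every pair of hashtags in the same tweet shares an edge.
            for a, b in combinations(sorted(tags), 2):
                neighbours[a].add(b)
                neighbours[b].add(a)
        if not neighbours:
            return 0.0
        return sum(len(n) for n in neighbours.values()) / float(len(neighbours))


graph = SlidingHashtagGraph()
first = graph.add(0, ["Spark", "Apache"])
second = graph.add(10, ["Apache", "Hadoop", "Storm"])
third = graph.add(65, ["Flink", "Spark"])  # the tweet at t=0 is evicted here
```

Rebuilding the graph on each update keeps the sketch short. A version that adds and removes edges as tweets enter and leave the window would avoid the repeated work.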

Testing

I have added tests in insight_testsuite/tests.

Additionally, I have unit tests defined in the src/test/ directory to validate the code in src/utils/twitter.py. I would have liked to clean them up, but didn't have enough time.
