
bamboo's People

Contributors

dorey, dpapathanasiou, iurisilvio, larryweya, mejymejy, modilabs-bumblebee, modilabs-starscream, myf, pld, prabhasp, rasmuswl, rgaudin, ukanga

bamboo's Issues

Add new end point for aggregate level calculations: results in new dataset

Add new end point POST /aggregations/[ID]

request parameters:

  • name is the name of the column that results will be stored in, e.g. avg_num_students
  • formula is the formula used to calculate the field values, e.g. avg(num_students)
  • query is a selector query, e.g. where is_hs=True
  • group is the column to group by, e.g. lga

The results are stored in a new dataset. The aggregation itself is stored in an aggregation model that includes the parameters above, the id of the originating dataset, and the id of the resulting dataset.

Aggregations with the same group value for the same originating dataset are stored in the same resulting dataset.

Start by supporting a count formula operation.

[for the future ratio, min, max, sum, mean, etc]

see comments: #16
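
As a rough illustration of the count operation described above, here is a minimal sketch in plain Python. It is not bamboo's implementation: the function name `count_aggregation`, the in-memory list of dicts, and the dict-based `query` (a simplified stand-in for the selector query) are all assumptions made for the example.

```python
from collections import Counter


def count_aggregation(rows, name, group, query=None):
    """Count rows per value of `group`, optionally filtered by `query`.

    `rows` is a list of dicts. `query` maps column -> required value and is
    a hypothetical simplification of the selector query in the request.
    """
    query = query or {}
    # Keep only rows that match every condition in the query.
    selected = [r for r in rows if all(r.get(k) == v for k, v in query.items())]
    counts = Counter(r[group] for r in selected)
    # One output row per group value, with the count stored under `name`.
    return [{group: value, name: n} for value, n in sorted(counts.items())]


rows = [
    {"lga": "A", "is_hs": True, "num_students": 100},
    {"lga": "A", "is_hs": False, "num_students": 50},
    {"lga": "B", "is_hs": True, "num_students": 70},
    {"lga": "A", "is_hs": True, "num_students": 30},
]
result = count_aggregation(rows, name="num_schools", group="lga", query={"is_hs": True})
```

The returned list plays the role of the resulting dataset: one row per `lga` value, which is why aggregations sharing a group column could be stored side by side as extra columns.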

cache summary stats after generation

  • when summary stats are requested, cache the results
  • maybe in the database
  • maybe with redis or another tool
  • this includes group bys

(in general cache as much as reasonable)
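
One possible shape for this, sketched with an in-process dict standing in for the database or redis. The class `SummaryCache`, its `compute` callback, and the `(dataset_id, group)` key are hypothetical names chosen for illustration.

```python
class SummaryCache:
    """Cache summary stats per (dataset_id, group) pair."""

    def __init__(self, compute):
        self._compute = compute  # function(dataset_id, group) -> summary
        self._store = {}         # stand-in for a database table or redis
        self.misses = 0

    def get(self, dataset_id, group=None):
        key = (dataset_id, group)  # grouped summaries get their own entry
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._compute(dataset_id, group)
        return self._store[key]

    def invalidate(self, dataset_id):
        # Drop every cached summary for a dataset, e.g. after its data changes.
        for key in [k for k in self._store if k[0] == dataset_id]:
            del self._store[key]


def fake_summary(dataset_id, group):
    # Placeholder for the real summary computation.
    return {"dataset": dataset_id, "group": group}


cache = SummaryCache(fake_summary)
cache.get("abc", "lga")
cache.get("abc", "lga")  # served from the cache
cache.get("abc")         # ungrouped summary is a separate entry
```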

Groups are no longer working

needed so that I can complete static/bar_graphs.html, which is otherwise almost there.

test process:

  • load bar_graphs.html
  • open firebug
  • on the right hand side, you see a select box, change it; the action should be to hit bamboo with a group url
  • see if firebug output includes a json that contains a grouped json summary; doesn't right now.

Add explicit typing to columns

  • explicitly detect or set column types
  • store column type with column and return in JSON, etc.
  • in calculations store type from input type
    • maybe use input column types to validate formula

note: currently boolean calculations generate outputs as 1 or 0 integers, in the future type these columns as booleans
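
A minimal sketch of what detecting column types could look like, assuming raw values arrive as strings (as they would from a CSV). The type names and the functions `infer_column_type` and `infer_schema` are assumptions for illustration, not bamboo's API.

```python
def infer_column_type(values):
    """Return 'boolean', 'integer', 'float', or 'string' for raw string values."""
    non_empty = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not non_empty:
        return "string"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "boolean"
    # Try the narrowest numeric type first, then widen.
    for type_name, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(rows):
    """Map each column name to its inferred type, ready to return in JSON."""
    columns = rows[0].keys() if rows else []
    return {c: infer_column_type([r.get(c) for r in rows]) for c in columns}


rows = [
    {"lga": "A", "num_students": "100", "ratio": "0.5", "is_hs": "True"},
    {"lga": "B", "num_students": "70", "ratio": "1", "is_hs": "false"},
]
schema = infer_schema(rows)
```

Note that a column holding only 0 and 1 would be typed as integer here, which mirrors the current behaviour of boolean calculations mentioned in the note; typing those as booleans would need the type to be set explicitly.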

Building indicators

  • formula with canonical name for arguments and results
    • e.g. student_teacher_ratio := num_teachers / num_students
  • user-defined mapping between their column names and the columns required by the formula:
    • num_teachers: my_num_teachers
    • num_students: my_num_girl + my_num_boys
  • or
    • num_teachers: my_num_teachers
    • num_students: my_num_students
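
The mapping idea can be sketched as follows. To keep the example small, each canonical name maps to a list of user columns that are summed (a one-element list is a plain rename); that representation and the helper `resolve_arguments` are assumptions, and a real implementation would presumably accept arbitrary expressions.

```python
def resolve_arguments(row, mapping):
    """Build the formula's canonical arguments from a user's row.

    `mapping` maps each canonical name to a list of user columns whose
    values are summed (hypothetical, simplified mapping format).
    """
    return {name: sum(row[col] for col in cols) for name, cols in mapping.items()}


def student_teacher_ratio(num_teachers, num_students):
    # Formula as written in the issue, using canonical argument names.
    return num_teachers / num_students


row = {"my_num_teachers": 4, "my_num_girl": 60, "my_num_boys": 40}
mapping = {
    "num_teachers": ["my_num_teachers"],
    "num_students": ["my_num_girl", "my_num_boys"],
}
ratio = student_teacher_ratio(**resolve_arguments(row, mapping))
```

Because the formula only ever sees canonical names, the same indicator definition can be reused across datasets with different column layouts by swapping the mapping.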

Code coverage at 100%

Currently,

---------- coverage: platform linux2, python 2.7.3-final-0 -----------                                    
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
bamboo                                    7      1    86%
celeryconfig                              6      0   100%
config/__init__                           0      0   100%
config/db                                15      0   100%
config/settings                           1      0   100%
controllers/__init__                      0      0   100%
controllers/calculations                 18      0   100%
controllers/datasets                     28      7    75%
controllers/root                          4      1    75%
lib/__init__                              0      0   100%
lib/constants                            13      0   100%
lib/decorators                           13      8    38%
lib/exceptions                            2      0   100%
lib/io                                   22      2    91%
lib/parser                               55      4    93%
lib/summary                              23      8    65%
lib/tasks/__init__                        0      0   100%
lib/tasks/calculator                      9      0   100%
lib/tasks/import_dataset                 12      2    83%
lib/utils                                40      1    98%
models/__init__                           0      0   100%
models/abstract_model                    11      0   100%
models/calculation                       23      2    91%
models/dataset                           19      0   100%
models/observation                       38      6    84%
tests/__init__                            0      0   100%
tests/controllers/__init__                0      0   100%
tests/controllers/test_calculations      21      0   100%
tests/controllers/test_datasets          46      8    83%
tests/decorators                         17      4    76%
tests/lib/test_calculator                30      0   100%
tests/lib/test_parser                    14      0   100%
tests/lib/test_summary                    7      0   100%
tests/models/__init__                     0      0   100%
tests/models/test_calculation            23      0   100%
tests/models/test_dataset                29      0   100%
tests/models/test_observation            40      0   100%
tests/test_base                          21      0   100%
tests/test_server                         4      0   100%
---------------------------------------------------------
TOTAL                                   611     54    91%

store schema for dataset

  • figure out the data structure on bamboo for storing the schema
    • does it just define datatypes?
    • how to add type information to the schema

some sort of message when you group by a numeric type

Right now it just ignores your group-by request and gives you all the summary stats back. Do we want to let the end user know? Maybe this doesn't matter, since we'll support numeric group-bys in the future.

add content to home page

we should have something other than "Ohai World!"... preferably a link to the docs/src/etc. and possibly a description or api reference

Calculate outliers

In summary statistics for numeric types:

  • calculate outliers
  • remove outliers from summary statistics
  • return outlier rows to user
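
The issue does not say how outliers should be defined. As one common option, the sketch below uses the interquartile range rule (Tukey's fences with a multiplier of 1.5); the function name `split_outliers`, the `k` parameter, and the summary fields are assumptions for illustration.

```python
import statistics


def split_outliers(rows, column, k=1.5):
    """Return (summary of inliers, outlier rows) for a numeric column.

    Uses the IQR rule: values outside [Q1 - k*IQR, Q3 + k*IQR] are outliers.
    """
    values = [r[column] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread

    inliers = [r[column] for r in rows if low <= r[column] <= high]
    outliers = [r for r in rows if not (low <= r[column] <= high)]

    # Summary statistics computed with the outliers removed.
    summary = {
        "count": len(inliers),
        "min": min(inliers),
        "max": max(inliers),
        "mean": statistics.mean(inliers),
    }
    return summary, outliers


rows = [{"num_students": n} for n in (10, 12, 11, 13, 12, 11, 10, 500)]
summary, outliers = split_outliers(rows, "num_students")
```

Returning the outlier rows alongside the cleaned summary lets the user inspect what was excluded rather than having values silently dropped.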

cache calculations

  • should we cache calculations?
    • tied to hash
  • do we also prime them?
    • create on upload of data, asynchronously
    • otherwise you'll get a delay on first pull
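
A sketch of how a hash-keyed cache with priming might fit together. It derives the key by hashing the rows and the formula, which is one reading of "tied to hash"; the functions `calculation_key`, `prime`, and `fetch` are hypothetical, and priming runs synchronously here where a real setup would hand it to a background task.

```python
import hashlib
import json

_results = {}  # stand-in for a persistent cache


def calculation_key(rows, formula):
    """Derive a stable key from the data and the formula."""
    payload = json.dumps({"rows": rows, "formula": formula}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def prime(rows, formula, compute):
    """Compute and store a result ahead of time, e.g. on data upload."""
    key = calculation_key(rows, formula)
    _results[key] = compute(rows)
    return key


def fetch(rows, formula, compute):
    """Return the cached result, computing it only on a cache miss."""
    key = calculation_key(rows, formula)
    if key not in _results:
        _results[key] = compute(rows)
    return _results[key]


calls = []


def total_students(rows):
    calls.append(1)  # track how often the calculation actually runs
    return sum(r["num_students"] for r in rows)


rows = [{"num_students": 100}, {"num_students": 70}]
prime(rows, "sum(num_students)", total_students)
total = fetch(rows, "sum(num_students)", total_students)
```

With priming, the first pull is already a cache hit; without it, `fetch` still works but pays the computation cost on the first request.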
