urbanccd-uchicago / plenario-client-py
Official Python Client for the Plenario Datasets API
License: Other
I want to be able to quickly grep logs for client usage on the app. Add a User-Agent
header to the requests. Something like plenario-client-py
... I dunno. If you've got a better name then use that.
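A minimal sketch of what this could look like with the stdlib's urllib (a requests session header would work the same way); "plenario-client-py" is the name suggested above, and the version string is illustrative:

```python
import urllib.request

# Name suggested in this issue; the version number is illustrative.
USER_AGENT = "plenario-client-py/0.1.0"

def build_request(url: str) -> urllib.request.Request:
    """Attach the client-identifying User-Agent so server logs are greppable."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```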
Story #4
This is the initial stage of building the client. For this, we need:

Responses are paginated by the server, and the server provides links in the meta object of the response on how to paginate. Users need to have a clean way to page through the data. meta and data need to be dropped down a level into a page object. The parent Response needs to implement an iterator that returns pages, and on each subsequent page sends a request for the next page. The iterator stops when the meta.links.next value is null.
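The iteration described above could be sketched as a generator; `fetch` here is a hypothetical injected callable that maps a URL to the parsed JSON body:

```python
def iter_pages(first_url, fetch):
    """Yield {meta, data} pages, following meta.links.next until it is null.

    `fetch` is a hypothetical callable returning the parsed JSON body for a URL.
    """
    url = first_url
    while url is not None:
        body = fetch(url)
        # meta and data are dropped down a level into a page object
        yield {"meta": body["meta"], "data": body.get("data", [])}
        url = body["meta"]["links"]["next"]
```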
Story #3
Following up on #16, implement this library. Descriptions include bounding boxes that need to be cast to objects, and user-created geometries need to be cast to GeoJSON strings for filtering. Cast Description.bbox to a Polygon, and make sure geometries are properly serialized as strings in the parameters.
Story #4
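A sketch of both casts, assuming a bbox given as (min_x, min_y, max_x, max_y); the helper names are hypothetical:

```python
import json

def bbox_to_polygon(min_x, min_y, max_x, max_y):
    """Cast a bounding box to a GeoJSON Polygon object (closed ring)."""
    ring = [[min_x, min_y], [min_x, max_y], [max_x, max_y],
            [max_x, min_y], [min_x, min_y]]
    return {"type": "Polygon", "coordinates": [ring]}

def geom_param(geom):
    """Serialize a geometry to a GeoJSON string for use as a query parameter."""
    return json.dumps(geom)
```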
All responses will contain a meta block. This class will handle the information contained in that object. This is a critical piece of the response and how the users will interact with the API, especially pagination and understanding query structure.

The meta object of the response will look like:
{
"links": {
"current": "a self-referential link",
"previous": "a link to the previous page of records",
"next": "a link to the next page of records"
},
"params": {
"page_size": "the maximum limit of records in a page",
"page": "the current page number",
"order_by": [
"direction",
"field"
]
},
"counts": {
"total_pages": "the total number of pages of records",
"total_records": "the total number of records in all the pages",
"data": "the number of records in the data portion of the response",
"errors": "the number of errors in the errors portion of the response"
}
}
links, params, and counts must all be available to the user. Within links and counts, all of the example field keys must be present. params will be dynamically built using the response -- it's based on the query params sent in the request.
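A sketch of such a Meta class, keeping the dynamic params as a plain dict (the class shape is an assumption, not a settled design):

```python
from types import SimpleNamespace

class Meta:
    """Wrap the response's meta block: links, params, and counts."""

    def __init__(self, payload):
        self.links = SimpleNamespace(**payload["links"])
        self.counts = SimpleNamespace(**payload["counts"])
        # params mirror whatever query params were sent, so keep them dynamic
        self.params = dict(payload["params"])
```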
Story #2
The vast majority of responses will contain data. What lands in this object is widely subjective and basically impossible to account for ahead of time, given that the data sets are only known when they are accessed and are updated continuously.
In order to handle the wide variety of response data, we need to think through what is the best way that we can wrap all of this up -- we need to consider the use cases for this data.
Use cases: to me there are two that come to mind -- general, manual exploration (typically via a Jupyter notebook), and automated, large scale data processing (either via Jupyter or some other system).
For the first case, it would make sense to return the data in some sort of named fashion: either a dictionary or namedtuple would work nicely here. I would think the tuples would be best, but we need to take into account that they only work with alphanumeric values for keys -- if a field is named My Field it would need to be snaked into my_field.
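A sketch of that snaking, plus using it to build a row type at runtime (the helper name and extra `f_` prefix rule are assumptions):

```python
import keyword
import re
from collections import namedtuple

def snake(name):
    """Turn a field name like 'My Field' into a valid attribute name."""
    s = re.sub(r"\W+", "_", name).strip("_").lower()
    if not s or s[0].isdigit() or keyword.iskeyword(s):
        s = "f_" + s  # namedtuple fields must be valid, non-keyword identifiers
    return s

# Field names are only known once the data set is fetched
Row = namedtuple("Row", [snake(f) for f in ["My Field", "Observed At"]])
```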
For the second case, I would think that more serious analysis would be done using one of the data/scientific libraries (pandas, numpy, etc). In these cases, we should load the data directly into the appropriate container (DataFrame, NDArray, whatever).
Of course, this would involve getting fancy in both how the user wants to parse the response and how we package the client (but that's something I know how to do, and it's not that bad).
Story #2
We need to marshal values from a JSON object to a native object. TimeRanges and their use would be a great addition to providing users with a clean and intuitive interface.
Time ranges consist of lower and upper bounds and lower and upper inclusive booleans. See https://github.com/UrbanCCD-UChicago/plenario2/blob/master/lib/postgres_extensions.ex#L7 for some detailed information about their use in the API.
Specific to this implementation, we need to be able to check for multiple conditions and functionality as implemented by:
__lt__
__le__
__ge__
__gt__
__eq__
__contains__
__str__
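A sketch of a few of the listed dunders on a minimal TimeRange (the dataclass supplies __eq__; the ordering dunders would follow the same comparison pattern):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class TimeRange:
    """Lower/upper bounds plus inclusivity flags, as described above."""
    lower: datetime
    upper: datetime
    lower_inclusive: bool = True
    upper_inclusive: bool = False

    def __contains__(self, ts):
        above = ts >= self.lower if self.lower_inclusive else ts > self.lower
        below = ts <= self.upper if self.upper_inclusive else ts < self.upper
        return above and below

    def __str__(self):
        lo = "[" if self.lower_inclusive else "("
        hi = "]" if self.upper_inclusive else ")"
        return f"{lo}{self.lower.isoformat()}, {self.upper.isoformat()}{hi}"
```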
Also note that due to the permissive nature of ranges, we need to normalize some values. Here are the assumptions that you can make: seconds is the least significant value, so always round to that.

Regarding lower/upper exclusivity, here's an example:
{
  "lower": "2018-01-01T00:00:00",
  "upper": "2019-01-01T00:00:00",
  "lower_inclusive": true,
  "upper_inclusive": false
}
is equal to
{
  "lower": "2018-01-01T00:00:00",
  "upper": "2018-12-31T23:59:59",
  "lower_inclusive": true,
  "upper_inclusive": true
}
Of course, given the stated assumptions above, we would want to normalize the second object to be the first when parsing the JSON payload.
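A sketch of that normalization, rounding to seconds and canonicalizing every range to inclusive-lower/exclusive-upper form (the function name is hypothetical):

```python
from datetime import datetime, timedelta

ISO = "%Y-%m-%dT%H:%M:%S"

def normalize(rng):
    """Shift bounds by one second (the least significant unit) so every
    range ends up lower-inclusive and upper-exclusive."""
    lower = datetime.strptime(rng["lower"], ISO)
    upper = datetime.strptime(rng["upper"], ISO)
    if not rng["lower_inclusive"]:
        lower += timedelta(seconds=1)
    if rng["upper_inclusive"]:
        upper += timedelta(seconds=1)
    return {"lower": lower.strftime(ISO), "upper": upper.strftime(ISO),
            "lower_inclusive": True, "upper_inclusive": False}
```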
Story #4
Responses will always contain meta parts; they could have either data or errors. We need to be able to parse those different parts of the body.
Epic #1
In several places in the Description class we should be parsing the timestamp strings to proper datetime objects.
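A sketch, assuming the API's timestamps are ISO-8601 strings (the helper name and the None pass-through are assumptions):

```python
from datetime import datetime

def parse_timestamp(value):
    """Parse an ISO-8601 timestamp string into a datetime; pass None through."""
    return None if value is None else datetime.fromisoformat(value)
```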
Story #4
This is a research task. Come up with a short list of libs that will correctly handle GeoJSON and cast values into proper Geometries.
Story #4
For requests that error out, we need to provide users with some sort of accessible means to the error information.

In the response body, errors will be a list of strings. Most typically, it will be a single string, as the server should halt on the first error it encounters.

I'm open to this either being an object in its own right and deferring error alerting to the response class (built later), or making this a custom Exception that is raised. I'm leaning more toward it being an exception, but I can see some marginally useful cases for wrapping it as an object and deferring raising it until later.
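If the exception path wins, it could be as small as this (the class name is hypothetical):

```python
class ApiError(Exception):
    """Carries the response's errors list; the message joins them for display."""

    def __init__(self, errors):
        self.errors = list(errors)
        super().__init__("; ".join(self.errors))
```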
Story #2
Start with the basics on this one.

The response comes back as JSON. It will always have meta information that can be loaded into a Meta object. It will contain either errors, which, depending on the path we took with those, will need to either be loaded into an object or raised as an exception, or it will contain data. Data then needs to be marshalled into whatever we built to wrap that.
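The branching above could be sketched as follows (names hypothetical; this takes the raise-an-exception option for errors, using a plain ValueError to stay self-contained):

```python
def parse_response(body):
    """Split a JSON response body into meta and data, raising on errors."""
    meta = body["meta"]  # always present
    errors = body.get("errors")
    if errors:
        raise ValueError("; ".join(errors))
    return meta, body.get("data", [])
```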
Story #3
There are 4 different request types that can be sent to the API:
Each of these has its own unique set of parameters and endpoint identifiers. They also have common patterns, like pagination and resource base routes.
We need to build out some sort of workflow for accessing data sets via these request types.
Epic #1