
dwapi-spec's Introduction

data.world: Public API specification

This repository contains the OpenAPI (a.k.a. Swagger) specification for data.world's API.

Documentation

Reference documentation and code examples for this API can be found at: https://apidocs.data.world/

Contributing

The data.world API specification is an open-source project. Community participation is encouraged. If you'd like to contribute, please follow the Contributing Guidelines.

License

Apache License 2.0

dwapi-spec's People

Contributors

data-koala, hardworkingcoder, kennyshittu, laconc, lydiaguarino, mariechatfield, markmarkoh, moses-c-smith, nikhilhegde-ddw, nirajpatel, rebeccaclay, rflprr, rickliao, shad, shawnsmith, smverleye, tigerhe7, tle


dwapi-spec's Issues

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Error type: Cannot find preset's package (github>datadotworld/renovate-config)

Unable to add description and labels

Currently, when no description or labels have been added to any of the files in a dataset, attempting to change the description or labels via the API fails with HTTP 404.

This seems to be because the user layer doesn't yet exist.

Title should be optional in PUT:/datasets

When we allowed title changes in PUT:/datasets, we made title a required attribute. With backwards compatibility in mind, we need a sensible default (the existing title) in case it's not provided in the request.

At the moment, this breaks some of our existing integrations (including Python, R, and CKAN).

Problems with sort parameters

sort, when applied to GET:/user/datasets/liked, doesn't seem to have any effect on the response. Meanwhile, when applied to GET:/user/datasets/own or GET:/user/datasets/contributing, it results in HTTP 500, no matter what value it carries.

Parameterized Queries

I'd like to propose a spec for how parameterized queries should be processed by the query endpoints. There are two parts to that:

  • How Parameter Values and Types are specified.
  • How Parameter Names and Values are provided on the URL.

Parameter Values and Types

For both SQL and SPARQL, I think we should support "simple", "safe", and "RDF" parameter values.

  • RDF parameters allow for complete precision in the same language the underlying query engine understands, at the cost of a pretty verbose and esoteric syntax. If you want total precision, use this syntax.
  • "safe" parameters allow you to be specific about String/URI parameters by wrapping values in "" or <> - definitely a necessity for user-entered content. If you're building an SDK or integration, this helps make sure that you can be precise about types without the loss of readability in RDF types.
  • "simple" parameters mean that we'll do the right thing with most values, defaulting to String where we can't make a better guess. This maximizes the chance that ad-hoc queries will return results when the user's meaning is clear.

To parse values, apply the following algorithm:

try "RDF" parameters:

if value matches /^"(.*)"\^\^<([^<>]*)>$/ :

  "abcdef"^^<http://www.w3.org/2001/XMLSchema#string>
  "3"^^<http://www.w3.org/2001/XMLSchema#integer>
  "4.2"^^<http://www.w3.org/2001/XMLSchema#decimal>
  "true"^^<http://www.w3.org/2001/XMLSchema#boolean>

(matches two "groups" - the string type value and the URI of the type)

"Safe" parameters:

if value matches /^"(.*)"$/ :

  "abcdef"                                 <- String
  "3"                                      <- String
  "4.2"                                    <- String
  "true"                                   <- String
  "https://data.world/"                    <- String

(matches one group - the string value)

if value matches /^<(.*)>$/ :

  <https://data.world/>                    <- URI
  <abcdef>                                 <- URI
  <3>                                      <- URI

(matches one group - the URI)

"Simple" parameters:

if value matches /^([0-9]+)$/ :

  3                                        <- Integer

if value matches /^([0-9]*[.][0-9]+)$/ :

  4.2                                      <- Decimal

if value matches /^(true|false)$/ :

  true                                     <- Boolean

if value matches /^([a-z]+:\/\/.*)$/ :

  https://data.world/                      <- URI

(all of the above match one group - the value to interpret as Integer/Decimal/Boolean/URI)

otherwise :

  abcdef                                   <- String

(just treat the whole value as a String if nothing else matches)
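The fallthrough logic above can be sketched as a single function. This is a hypothetical reference implementation, not part of the spec; the function name and the XSD type names returned for "simple" and "safe" values are illustrative assumptions.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

def parse_parameter(value: str):
    """Return a (lexical value, datatype URI) pair for a raw parameter string.

    Tries "RDF" syntax first, then "safe" syntax, then "simple" guessing,
    and finally falls back to String, mirroring the algorithm above.
    """
    # "RDF" parameters: "literal"^^<datatype-uri>
    m = re.match(r'^"(.*)"\^\^<([^<>]*)>$', value)
    if m:
        return (m.group(1), m.group(2))
    # "Safe" parameters: quoted strings are always Strings...
    m = re.match(r'^"(.*)"$', value)
    if m:
        return (m.group(1), XSD + "string")
    # ...and angle-bracketed values are always URIs.
    m = re.match(r'^<(.*)>$', value)
    if m:
        return (m.group(1), XSD + "anyURI")
    # "Simple" parameters: guess the type from the shape of the value.
    if re.match(r'^[0-9]+$', value):
        return (value, XSD + "integer")
    if re.match(r'^[0-9]*[.][0-9]+$', value):
        return (value, XSD + "decimal")
    if re.match(r'^(true|false)$', value):
        return (value, XSD + "boolean")
    if re.match(r'^[a-z]+://.*$', value):
        return (value, XSD + "anyURI")
    # Otherwise: treat the whole value as a String.
    return (value, XSD + "string")
```

Note that order matters: `"3"` hits the "safe" quoted-string rule before the "simple" integer rule, which is exactly the behavior the tables above describe.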

Parameter Names and Values

For SPARQL:

SPARQL supports named parameters, and parameters in queries can be specified either as ?var or $var - it's a very common convention to use ?var for variables that are meant to be matched and $var for variables that are bound to the query execution. Because of that, using the $ syntax as query string parameters is a common way to pass bound variables on a HTTP URL. No reason we shouldn't use that syntax here:

  .../sparql/user/dataset?query=<QUERY>&$var1=<VALUE1>&$var2=<VALUE2>

where <VALUE1> and <VALUE2> are values formatted according to the spec above.

For SQL:

SQL only supports positional parameters. Luckily, HTTP query strings have a straightforward way to specify an arbitrary-length sequence of values for a parameter: simply repeat the same query parameter name, and the multiple instances will be treated as a sequence of values. I'm proposing that we use p for the name of our parameter variable (to keep the URLs nice and short), but param or parameter would work too:

  .../sql/user/dataset?query=<QUERY>&p=<VALUE1>&p=<VALUE2>

where, again, <VALUE1> and <VALUE2> are values formatted according to the spec above.

In both cases (SPARQL and SQL) the way we interpret values is identical. Naturally, the values will need to be URL-encoded when actually sent in a URL (as with any query-string value).
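As a sketch of the URL construction, assuming hypothetical base URLs (the proposal itself doesn't fix a host), the two conventions might be built like this; note that urlencode percent-encodes the $ in parameter names as %24, which decodes back to $ on the server side:

```python
from urllib.parse import urlencode

def sparql_url(base: str, query: str, **bindings) -> str:
    """Named SPARQL parameters are passed as $var query-string parameters."""
    params = [("query", query)] + [("$" + name, value)
                                   for name, value in bindings.items()]
    return base + "?" + urlencode(params)

def sql_url(base: str, query: str, *values) -> str:
    """Positional SQL parameters are passed as repeated `p` parameters."""
    params = [("query", query)] + [("p", value) for value in values]
    return base + "?" + urlencode(params)

# Example usage (base URL and query are made up for illustration):
url = sql_url("https://api.example/sql/user/dataset",
              "SELECT * FROM t WHERE a = ? AND b = ?",
              "3", '"abcdef"')
```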

Upgrade implementation project to swagger-maven-plugin 3.1.5

Not all features of Swagger can be used in this project because the actual API implementation leverages swagger-maven-plugin and is limited by what that plugin can do. Version 3.1.5 was just released and supports useful features, such as global consumes and produces settings.

Implement DELETE for datasets

Implement DELETE:/datasets in a way that isn't prone to accidental deletes, by either:

  1. Controlling access with a different token scope
  2. Implementing it as a soft delete
  3. Both

Introduce new response type for queries (JSON stream)

Query endpoints should be able to produce a stream of JSON rows so that clients can process them more efficiently. Ideally, the header (first row) should be a table schema, complete with column names, types, and descriptions.
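To make the proposal concrete, here is one way a client could consume such a stream, assuming a newline-delimited JSON format where the first line is a schema object and each subsequent line is a row array. Both the wire format and the field names are assumptions for illustration, not a decided design.

```python
import io
import json

# Simulated response body: schema line first, then one JSON array per row.
# (The format shown here is hypothetical.)
sample_stream = io.StringIO(
    '{"fields": [{"name": "id", "type": "integer"},'
    ' {"name": "name", "type": "string"}]}\n'
    '[1, "alice"]\n'
    '[2, "bob"]\n'
)

# Read the schema header, then process rows one line at a time, so the
# whole result set never has to be held in memory at once.
schema = json.loads(sample_stream.readline())
columns = [field["name"] for field in schema["fields"]]

rows = [dict(zip(columns, json.loads(line))) for line in sample_stream]
```

The per-line framing is what lets clients start working before the response finishes; a single top-level JSON array would force most parsers to buffer everything.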

Improve OAuth documentation

[x] The OAuth documentation says "redirect_uri Optional" for the authorization endpoint, but I get an error saying client_id and redirect_uri are required. I tested the URL from the docs, https://data.world/oauth/authorize?code=zac4ZV2XbleQ2e&client_id=3MVG9lKcPoNINVB&client_secret=3iQF9BsWEr6nCf&grant_type=authorization_code, and got the error {"error":"invalid_request","message":"A client_id and redirect_uri are required"}
[x] Say in the documentation how to obtain client_id and client_secret
[x] Change redirect_url to redirect_uri
[ ] Make code samples for various languages
[ ] Explain what to do if HTML for a login page is returned to you
[ ] Mention that if there's a hash in either client_id or client_secret, the # must be changed to %23
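The last point is standard percent-encoding, which the docs could illustrate in one line; the credential value here is made up:

```python
from urllib.parse import quote

# A "#" in a raw credential must be percent-encoded before it goes in a
# URL, otherwise everything after it is treated as a fragment.
raw_client_id = "abc#123"          # hypothetical credential
encoded = quote(raw_client_id, safe="")  # "#" becomes "%23"
```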

Send a custom header & value

Suggestions from @bryonjacob:

  • for every account, generate an OUTBOUND_AUTH_TOKEN; this can be a version 4 UUID. Store it on the agent record
  • show it to the user on /settings/advanced, right next to their API token(s). Give them the ability to reset it, maybe.
  • on every outbound "webhook" request generated by a user, send that value in a custom header.
  • the receiving user can use that to know if the request is really coming from us (and will ignore that header if they don't know what it is)
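The suggestion above can be sketched as follows. The header name and helper functions are assumptions for illustration; only the version 4 UUID token and the send/verify flow come from the suggestion itself.

```python
import hmac
import uuid

def generate_outbound_auth_token() -> str:
    """A version 4 (random) UUID, generated once and stored per account."""
    return str(uuid.uuid4())

def webhook_headers(token: str) -> dict:
    """Headers attached to every outbound webhook request.
    X-DW-Outbound-Auth-Token is a hypothetical header name."""
    return {"X-DW-Outbound-Auth-Token": token}

def verify(received: str, expected: str) -> bool:
    """Receiver-side check that the request really came from us.
    Constant-time comparison avoids leaking the token via timing."""
    return hmac.compare_digest(received, expected)
```

Receivers that don't recognize the header simply ignore it, so the scheme is backwards compatible with existing webhook consumers.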

Remove validation constraints for response object types

Constraints on response object types can lead clients generated with Swagger codegen to reject responses that were valid at some point on data.world and should continue to be accepted for backwards compatibility's sake.

Trusting that the platform will not return invalid responses, validation constraints can be disabled for response objects on the Swagger spec side.
