API-in-a-Box

API-in-a-Box is exactly what it sounds like. Say you have a handful of CSV files that you need a searchable API for. Put those files in a GitHub repository, spin up this API-in-a-Box, and there you go! A REST hypermedia API that uses Elasticsearch's killer searching.

Setup

To get started, clone this repo.

Then set the environment variable ORIGIN_REPO in the docker-compose.yml file to the GitHub repo (format: [username]/[repo]) you would like to pull data from. You may use the current repo "switzersc/atlanta-food-data" as an example.

Then, from the root of this directory, build the containers:

docker-compose build

And start them up:

docker-compose up

If you would like to run them in the background, add -d to the up command.

Now you can curl to get your data:

curl http://localhost:4567/resources

If you're running on boot2docker, you will need to replace localhost with your boot2docker IP address.

Using this API

You have three endpoints to work with:

  • GET /resources: Lists all resources, 50 per page by default. Change the page size with the size query param, and move to the next "page" by passing the from query param, which identifies the result to start at.
  • GET /resources/:id: Returns a specific resource.
  • GET /resources/search: Searches all resources. size and from work the same way as above, and you can also use every query listed in the queries object of the /resources response. These queries match the field names (mappings) of each document: if your CSV has a column called "STREET", each document has a field named "STREET", and you can pass STREET as a query param to this endpoint. Results are those whose STREET value includes any word in the query value. To match the whole phrase rather than any word in it, add match_phrase=true to the query string. You can also search by FILE_SOURCE, a field added to every document that holds the name of the file the document's row came from.
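As a rough illustration of the query params described above, here is a minimal Python sketch that only builds request URLs. The base URL comes from the Setup section; the helper names resources_url and search_url and the sample value "Peachtree Street" are made up for this example and are not part of the project.

```python
from urllib.parse import urlencode

# Assumed base URL: the server address used in the Setup section.
BASE_URL = "http://localhost:4567"


def resources_url(size=None, offset=None):
    """Build a GET /resources URL with optional size/from pagination."""
    params = {}
    if size is not None:
        params["size"] = size
    if offset is not None:
        # "from" is a reserved word in Python, so the argument is "offset".
        params["from"] = offset
    query = urlencode(params)
    return f"{BASE_URL}/resources" + (f"?{query}" if query else "")


def search_url(fields, match_phrase=False, size=None, offset=None):
    """Build a GET /resources/search URL from field-name/value pairs."""
    params = dict(fields)
    if match_phrase:
        params["match_phrase"] = "true"
    if size is not None:
        params["size"] = size
    if offset is not None:
        params["from"] = offset
    return f"{BASE_URL}/resources/search?{urlencode(params)}"


if __name__ == "__main__":
    # Third page of 10 results each (results 20 through 29).
    print(resources_url(size=10, offset=20))
    # Hypothetical STREET value, matched as a whole phrase.
    print(search_url({"STREET": "Peachtree Street"}, match_phrase=True))
```

The printed URLs can be passed straight to curl or any HTTP client.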

All responses are returned in the Collection+JSON format, with a MIME type of 'application/vnd.collection+json'.
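Collection+JSON stores each item's fields as an array of name/value pairs, which can be awkward to work with directly. The sketch below flattens such a body into plain dictionaries. The sample document is hypothetical: it follows the general shape of the Collection+JSON spec, but the exact fields this API returns may differ, and the file name "markets.csv" is invented.

```python
import json

# Hypothetical response body shaped per the Collection+JSON spec.
# The field names and values are made up for illustration.
SAMPLE = """
{
  "collection": {
    "version": "1.0",
    "href": "http://localhost:4567/resources",
    "items": [
      {
        "href": "http://localhost:4567/resources/1",
        "data": [
          {"name": "STREET", "value": "Peachtree Street"},
          {"name": "FILE_SOURCE", "value": "markets.csv"}
        ]
      }
    ]
  }
}
"""


def items_to_dicts(document):
    """Flatten each item's name/value data array into a plain dict."""
    collection = document.get("collection", {})
    rows = []
    for item in collection.get("items", []):
        fields = {d["name"]: d.get("value") for d in item.get("data", [])}
        rows.append({"href": item.get("href"), "fields": fields})
    return rows


if __name__ == "__main__":
    for row in items_to_dicts(json.loads(SAMPLE)):
        print(row["href"], row["fields"])
```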

How it works

When you run docker-compose up, the Elasticsearch container starts first, and then the API container starts by running the api/deploy/start.sh script. This script sleeps for 8 seconds to give the Elasticsearch container time to start, then runs the Ruby file api/data_processor/file_grabber.rb, which calls GitHub's API to download the raw files from the repository named in the ORIGIN_REPO environment variable. It also creates an Elasticsearch index named api and adds each row of the downloaded files to that index as a document of type resource.
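To make the row-to-document step concrete, here is a small Python sketch of the idea. It is not the project's Ruby code and makes no Elasticsearch calls; it only shows how each CSV row could become a document keyed by column name, with FILE_SOURCE added. The function name rows_to_documents and the sample data are hypothetical.

```python
import csv
import io


def rows_to_documents(file_name, csv_text):
    """Turn each CSV row into a dict keyed by column name, plus FILE_SOURCE."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        doc = dict(row)
        # Record which file this row came from, so it can be searched on.
        doc["FILE_SOURCE"] = file_name
        documents.append(doc)
    return documents


if __name__ == "__main__":
    # Made-up file name and contents, for illustration only.
    sample = "NAME,STREET\nCorner Market,Peachtree Street\n"
    for doc in rows_to_documents("markets.csv", sample):
        print(doc)
```

Each resulting dictionary corresponds to one document in the index, which is why column names such as STREET become usable as search params.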

Then the start script spins up a Sinatra server that provides the three endpoints above and returns responses formatted according to the Collection+JSON spec.

NB: Currently, if one of the source files has any incorrect formatting, it will probably raise an exception and the FileGrabber class will skip over this file. Watch the docker logs for the API container to see if there's any note of skipping a file.

To Do

These are ideas for further development:

  • add validations and specs
  • add more cool Elasticsearch features
  • add examples of requests and responses to README
  • add ability to use any remote source, not just a GitHub repo
  • allow for private GitHub repos
  • add ability to use file structure in a repo to define different types of documents for different types of resources
