bienapi's Introduction

bienapi's People

Contributors

dependabot[bot], sckott

bienapi's Issues

/plot/protocols problem

Route /plot/protocols returns plot_metadata_id as all NAs:

res <- cli$get("plot/protocols")
jsonlite::fromJSON(res$parse("UTF-8"))$data

Suggest omitting "plot_metadata_id" from the response, assuming this route is a shortcut for the following query:

SELECT DISTINCT sampling_protocol
FROM plot_metadata;

API keys

I think it'd be better to have a separate API key for each person. One key shared by everyone seems to me the same as no key at all. A separate key per person would give a better sense of usage per person, and would let us throttle people using the API "too" heavily.
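A minimal sketch of what per-person keys could look like, in Python for illustration only (the API itself is not written in Python). The class name `KeyRegistry`, its methods, and the in-memory storage are all hypothetical; a real setup would presumably persist keys and counts somewhere.

```python
import secrets
from collections import Counter


class KeyRegistry:
    """Issue one API key per person and count requests per person (in memory)."""

    def __init__(self):
        self._owner_by_key = {}   # key -> person
        self._usage = Counter()   # person -> number of authenticated requests

    def issue(self, person):
        """Create and store a new random key for `person`."""
        key = secrets.token_hex(16)
        self._owner_by_key[key] = person
        return key

    def authenticate(self, key):
        """Return the key's owner and record one request, or None if unknown."""
        person = self._owner_by_key.get(key)
        if person is not None:
            self._usage[person] += 1
        return person

    def usage(self):
        """Requests per person, heaviest users first."""
        return self._usage.most_common()


# Hypothetical users, only to exercise the class.
registry = KeyRegistry()
alice_key = registry.issue("alice")
bob_key = registry.issue("bob")
for _ in range(3):
    registry.authenticate(alice_key)
registry.authenticate(bob_key)
print(registry.usage())
```

With counts kept per person, heavy users show up at the top of `usage()`, which is the information needed to decide whom to throttle.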

routes to do

occurrence routes

  • /occurrence/spatial/ Extract occurrence data for specified polygons (WKT) or bounding box (~ BIEN::BIEN_occurrence_spatialpolygons)
  • /occurrence/state/ Extract occurrence data for a state (~ BIEN::BIEN_occurrence_state)
  • /occurrence/county/ Extract occurrence data for a county (~ BIEN::BIEN_occurrence_county)
  • /occurrence/country/ Extract occurrence data for a country (~ BIEN::BIEN_occurrence_country)
  • /occurrence/count/ Count the number of (geoValid) occurrence records for each species in BIEN (~ BIEN::BIEN_occurrence_records_per_species)

species list routes

  • /list/county/ Extract species list by county
  • /list/state/ Extract a species list by state/province
  • /list/spatial/ Extract a list of species within a given WKT

plot routes

  • /plot/country/ Get plot data from specified countries
  • /plot/dataset/ Get plot data by dataset name
  • /plot/datasources/ List available data sources
  • /plot/datasources/<data source name> Get plot data by data source name
  • /plot/protocols/<protocol name> Get plot data by protocol name
  • /plot/name/<plot name> Get plot data by plot name (~ BIEN::BIEN_plot_name)
  • /plot/state/ Get plot data from specified states/provinces

ranges routes

  • /ranges/species/intersect/ Get range maps that intersect the range of a species (~ BIEN::BIEN_ranges_intersect_species)

trait routes

  • /traits/family/<trait> Extract specific trait data for given families
  • /traits/genus/ Extract all trait data for given genera
  • /traits/genus/<trait> Extract specific trait data for given genera
  • /traits/species/ Extract all trait data for given species
  • /traits/species/<trait> Extract specific trait data for given species
  • /traits/trait/ Extract all measurements for a trait
  • /traits/count/ Count the number of trait observations for each species in the BIEN database

rate limiting

caddy has a rate limit plugin, but I'm not sure that it's working. Probably not a big deal in the early days of public usage.
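If the caddy plugin turns out not to work, throttling could also be done inside the application. Below is a rough token-bucket sketch in Python, for illustration only. It is not caddy configuration, and the class names, the capacity and the refill rate are all assumptions.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Consume one token if available; False means the request should be refused."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keep one token bucket per API key (or per client IP)."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self._buckets = {}

    def allow(self, client_id):
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.rate, self.clock)
            self._buckets[client_id] = bucket
        return bucket.allow()


# Example numbers only: bursts of 5 requests, refilled at 1 request per second.
limiter = RateLimiter(capacity=5, rate=1.0)
print([limiter.allow("some-client") for _ in range(6)])
```

A refused request would typically be answered with HTTP 429. The `clock` argument is only there so the limiter can be tested without sleeping.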

potential offset param problem on /list/country route

res <- cli$get("list/country", query = list(country = "Canada", limit=1000))

This request works fine, but I get an error when I try adjusting the offset:

res <- cli$get("list/country", query = list(country = "Canada", offset=10))

The error is a 400, and the message is:

 "PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list\nLINE 4:             AND is_new_world = 1) ORDER BY scrubbed_species_...\n                                                   ^\n: SELECT COUNT(DISTINCT count_column) FROM (SELECT DISTINCT 1 AS count_column FROM \"species_by_political_division\" WHERE (country in ('Canada')\n            AND scrubbed_species_binomial IS NOT NULL\n            AND (is_cultivated = 0 OR is_cultivated IS NULL)\n            AND is_new_world = 1) ORDER BY scrubbed_species_binomial OFFSET $1) subquery_for_count"
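Reading the message, it looks like the count query wraps the paged query with its ORDER BY and OFFSET still attached while the select list has been replaced by a constant, which Postgres rejects for SELECT DISTINCT. One possible way around it is to compute the total from a query with no ORDER BY or OFFSET, and page separately. The sketch below is in Python with an in-memory SQLite table purely so it runs anywhere. SQLite does not enforce the same rule, so this shows only the query shape and does not reproduce the error. The function name `species_page`, the demo rows, and the single-country filter are assumptions.

```python
import sqlite3

# Shared FROM/WHERE, adapted from the query in the error message
# (simplified to a single country bound as a parameter).
FROM_WHERE = """
  FROM species_by_political_division
  WHERE country = ?
    AND scrubbed_species_binomial IS NOT NULL
    AND (is_cultivated = 0 OR is_cultivated IS NULL)
    AND is_new_world = 1
"""


def species_page(conn, country, limit=10, offset=0):
    """Return (total, names): total ignores paging, names is one ordered page."""
    # Count query: no ORDER BY, LIMIT or OFFSET.
    total = conn.execute(
        "SELECT COUNT(DISTINCT scrubbed_species_binomial)" + FROM_WHERE,
        (country,),
    ).fetchone()[0]
    # Page query: the ORDER BY column is part of the DISTINCT select list.
    rows = conn.execute(
        "SELECT DISTINCT scrubbed_species_binomial" + FROM_WHERE
        + " ORDER BY scrubbed_species_binomial LIMIT ? OFFSET ?",
        (country, limit, offset),
    ).fetchall()
    return total, [row[0] for row in rows]


# Made-up demo rows, only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE species_by_political_division ("
    "country TEXT, scrubbed_species_binomial TEXT, "
    "is_cultivated INTEGER, is_new_world INTEGER)"
)
conn.executemany(
    "INSERT INTO species_by_political_division VALUES (?, ?, ?, ?)",
    [
        ("Canada", "Picea glauca", 0, 1),
        ("Canada", "Acer rubrum", 0, 1),
        ("Canada", "Acer rubrum", None, 1),
        ("Canada", "Abies balsamea", 0, 1),
        ("Canada", "Rosa rugosa", 1, 1),
        ("Mexico", "Zea mays", 0, 1),
    ],
)
total, names = species_page(conn, "Canada", limit=10, offset=1)
print(total, names)
```

Keeping the count separate also means the reported total does not change as the offset changes.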

/plot route problem

Route /plot. If the fields parameter is used without including plot_metadata_id,
plot_metadata_id is still returned, but as all NA. If the intent is to make
plot_metadata_id non-optional, ensure that it is always populated. E.g., compare

res <- cli$get("plot/metadata?fields= plot_name,country")
jsonlite::fromJSON(res$parse("UTF-8"))$data
res <- cli$get("plot/metadata?fields= plot_metadata_id,plot_name,country")
jsonlite::fromJSON(res$parse("UTF-8"))$data
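Either behaviour would be fine as long as it is consistent. A small Python sketch of the two options follows, for illustration only. The helper name `columns_to_select` and its `always_include` argument are hypothetical and not taken from the API code.

```python
def columns_to_select(raw_fields, always_include=()):
    """Turn a comma-separated `fields` value into the list of columns to query.

    Columns named in `always_include` are added in front when the caller
    did not request them, so they are selected and therefore populated.
    """
    # Strip whitespace so that "fields= plot_name,country" still parses.
    requested = [name.strip() for name in raw_fields.split(",") if name.strip()]
    missing = [name for name in always_include if name not in requested]
    return missing + requested


# Option 1: plot_metadata_id is optional, so it is neither selected nor returned.
print(columns_to_select(" plot_name,country"))

# Option 2: plot_metadata_id is mandatory, so it is always selected.
print(columns_to_select(" plot_name,country", always_include=("plot_metadata_id",)))
```

In both cases the response would contain only the columns that were actually selected, so no column would come back as all NA just because it was left out of the query.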

routes with very long running postgres requests

Some routes

  • /occurrence/species ~ BIEN::BIEN_occurrence_species
  • /occurrence/genus ~ BIEN::BIEN_occurrence_genus
  • /occurrence/family ~ BIEN::BIEN_occurrence_family
  • /occurrence/spatial ~ BIEN::BIEN_occurrence_spatialpolygons
  • /occurrence/count ~ BIEN::BIEN_occurrence_records_per_species

sometimes take a very long time to run. This isn't just a unicorn server or caddy server thing: I have checked that the request is taking a long time on the postgres side of things. It looks like there are indices on the table view_full_occurrence_individual, so I assume that can't be it.

thoughts @ojalaquellueva ?

I can send some example postgres queries behind the API requests, and you can try them on your server to see if they also take a long time. If they just take a long time and there's no way to speed them up, we may need to serve these long-running requests in a separate async service so as not to bog down the main API.
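To make the async idea concrete, here is a rough Python sketch: the request returns a job id right away, the slow query runs in a worker, and the client polls for the result. The names `JobQueue` and `slow_query` and the status dictionaries are hypothetical, and a real service would more likely use a separate process or queue than threads inside the API.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor, wait


class JobQueue:
    """Run slow queries in worker threads; callers poll by job id."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, func, *args):
        """Start func(*args) in the background and return a job id at once."""
        job_id = uuid.uuid4().hex
        self._futures[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id):
        """Return a small dict that could be sent back as a JSON status response."""
        future = self._futures.get(job_id)
        if future is None:
            return {"state": "unknown"}
        if not future.done():
            return {"state": "running"}
        if future.exception() is not None:
            return {"state": "failed", "error": str(future.exception())}
        return {"state": "done", "result": future.result()}

    def wait(self, job_id, timeout=None):
        """Block until the job finishes or the timeout expires (handy in tests)."""
        wait([self._futures[job_id]], timeout=timeout)


release = threading.Event()


def slow_query(species):
    # Stand-in for a long-running postgres query.
    release.wait(timeout=5)
    return [{"scrubbed_species_binomial": species}]


jobs = JobQueue()
job_id = jobs.submit(slow_query, "Acer rubrum")
print(jobs.status(job_id))   # still running: the stand-in query is blocked
release.set()                # let the stand-in query finish
jobs.wait(job_id, timeout=5)
print(jobs.status(job_id))
```

The main API stays responsive because it only hands out job ids and status, while the expensive work happens elsewhere.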


e.g., query that takes a long time:

SELECT scrubbed_species_binomial, latitude, longitude,date_collected,datasource,dataset,dataowner,custodial_institution_codes,collection_code,a.datasource_id     
	FROM (
		SELECT * FROM view_full_occurrence_individual 
		WHERE higher_plant_group IS NOT NULL AND is_geovalid =1 
			AND latitude BETWEEN  27.31 AND 37.29 
			AND longitude BETWEEN  -117.13  AND  -108.62 
	) a
	WHERE st_intersects(ST_GeographyFromText('SRID=4326; POLYGON((-114.125 34.230,-112.346 34.230,-112.346 32.450,-114.125 32.450,-114.125 34.230)) '),a.geom) 
	AND (is_cultivated = 0 OR is_cultivated IS NULL) 
	AND is_new_world = 1  
	AND ( native_status IS NULL OR native_status NOT IN ( 'I', 'Ie' ) ) 
	AND higher_plant_group IS NOT NULL 
	AND (is_geovalid = 1 OR is_geovalid IS NULL) 
	ORDER BY scrubbed_species_binomial;

On my server: this query takes 5+ minutes; I didn't wait for it to finish.
On vegbiendev.nceas.ucsb.edu: it takes ~ 1 min 20 sec.

Even the shorter time on the vegbiendev.nceas.ucsb.edu server is too long for a normal REST API route. Could these longer queries be sped up? Additional indices perhaps? I'm not sure why my server and vegbiendev.nceas.ucsb.edu differ; they must have different setups.

/stem problem

/stem returns NA for all analytical_stem_id. This should not be possible if the route returns individual rows from table analytical_stem. E.g.,

res <- cli$get("stem/species", query = list(species = "Lysimachia quadrifolia", fields="datasource_id, scrubbed_species_binomial, analytical_stem_id, cover_percent"))
jsonlite::fromJSON(res$parse("UTF-8"))$data

fields to return per route

Per email conversation:

For each route, by default return the fields returned by the corresponding function in the R pkg, BUT also allow users to select any fields in the allowed set to return. See also #6
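A sketch of that rule in Python, for illustration only. The route entry, its default list and its allowed set are placeholders (the real per-route field lists are not given here), and `resolve_fields` is a hypothetical helper name.

```python
# Hypothetical per-route configuration; the real field lists are not given in this issue.
ROUTE_FIELDS = {
    "plot/metadata": {
        "default": ["plot_metadata_id", "plot_name", "country"],
        "allowed": {"plot_metadata_id", "plot_name", "country", "sampling_protocol"},
    },
}


def resolve_fields(route, fields_param=None):
    """Return the list of fields to select for `route`.

    No (or empty) `fields` parameter: the route's defaults.
    Otherwise: the requested fields, or ValueError if any is not in the allowed set.
    """
    config = ROUTE_FIELDS[route]
    requested = [name.strip() for name in (fields_param or "").split(",") if name.strip()]
    if not requested:
        return list(config["default"])
    unknown = [name for name in requested if name not in config["allowed"]]
    if unknown:
        raise ValueError("unknown fields for %s: %s" % (route, ", ".join(unknown)))
    return requested


print(resolve_fields("plot/metadata"))
print(resolve_fields("plot/metadata", "plot_name, sampling_protocol"))
```

A `ValueError` here would map naturally to a 400 response that names the offending fields.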

dockerize API

working on now - hitting a bit of a road block learning how to dockerize postgres, but will get there
