sat-utils / sat-api

One API to search public satellites metadata on AWS

Home Page: https://sat-api.developmentseed.org/search/stac

License: MIT License

JavaScript 99.27% Dockerfile 0.43% Shell 0.30%

sat-api's Issues

Stand up dev sat-api using the STAC spec

Deploy a new sat-api that uses the STAC spec.

This will support both Landsat and Sentinel, but only for assets on AWS. In the STAC spec, if the assets are available elsewhere (such as Google), these will be different records.

All sentinel-2 items acquired after 5 July 2018 are missing from the sat-api catalog

It seems that all sentinel-2 items acquired after 5 July 2018 are missing from the sat-api catalog at https://sat-api.developmentseed.org/search/stac.

This API request:

https://sat-api.developmentseed.org/search/stac?limit=1000&c:id=%22sentinel-2%22&datetime=2018-07-05/2019-01-01

returns no results:

{"type":"FeatureCollection","properties":{"found":0,"limit":1000,"page":1},"features":[]}

while there are many sentinel-2 images acquired after 2018-07-05 that are available on AWS. For instance this one: https://roda.sentinel-hub.com/sentinel-s2-l1c/tiles/30/N/TJ/2018/8/16/0/

Is that expected?

Deployment instructions

Hi,

I'm having some difficulties in following the documentation for deployments using the new structure.
The docs (deploy.md) mention the need to edit .kes/config.yml as in previous sat-api versions. I assumed initially that this was actually the example/.kes/config.yml directory, but this directory does not contain a cloudformation.template.yml file. It seems that the template is now under packages/api/template.

I may prepare a PR with updated deployment docs, but I need some initial guidance on how to configure the kes environment.

I was able to execute the steps described in the "Building local version" section of the main readme.md, except the last one "yarn watch".

Update README documentation

The documentation in the README needs to be updated to include:

simple deployment instructions...as close to single command deployment as possible, although having an existing bucket will still be required
how to add a new data source via a lambda or step function (e.g. see SNS Sentinel PR as example)

Actual use of the API should be in separate documentation

Move sat-api es and metadata libraries into sat-api-lib

CSV splitter code has been moved into lib/metadata.js, but both the es.js and metadata.js libs in sat-api are general enough to move into sat-api-lib.

sat-api should contain the minimum amount of code required to create lambdas and deploy the cloudformation template. Lambdas should simply call sat-api-libs except when the code is specific to a data source (i.e. the transform function to transform sensor specific metadata into STAC metadata)

collections links

The links in the collection-level metadata for both Landsat and Sentinel are an empty list.

It should be a dictionary and should contain a self link

https://sat-api.developmentseed.org/collections/sentinel-2-l1c/definition
https://sat-api.developmentseed.org/collections/landsat-8-l1/definition

Fully Implement STAC API

Not all features of the STAC API spec are supported in sat-api. For example, bbox is supposed to be a basic query, but only the more complex intersects query is supported.

We need to support all parts of the API spec, as indicated here:
https://github.com/radiantearth/stac-spec/tree/master/api-spec

Furthermore, version 0.6.0 is to be released soon, so there are some PRs that may affect the API.

Most notably, we need to support

all basic queries
POST support
sorting
the new query language: radiantearth/stac-spec#167
self-documenting endpoint

Implement STAC bbox filter.

Removes the existing intersects filter behavior which accepts a GeoJSON Feature or Geometry in favor of the STAC API bbox parameter. Refs #81.
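
To make the relationship between the two filters concrete, a bbox can be read as a special case of an intersects geometry. The sketch below is a hypothetical helper (`bboxToPolygon` is not taken from the sat-api source) and assumes the bbox is ordered `[west, south, east, north]` and does not cross the antimeridian.

```javascript
// Convert a bbox [west, south, east, north] into a GeoJSON Polygon.
// Hypothetical helper for illustration; not part of sat-api.
function bboxToPolygon(bbox) {
  if (!Array.isArray(bbox) || bbox.length !== 4) {
    throw new Error('bbox must be [west, south, east, north]');
  }
  const [west, south, east, north] = bbox.map(Number);
  if ([west, south, east, north].some(Number.isNaN)) {
    throw new Error('bbox values must be numeric');
  }
  return {
    type: 'Polygon',
    // One exterior ring, counter-clockwise, closed on its first position.
    coordinates: [[
      [west, south],
      [east, south],
      [east, north],
      [west, north],
      [west, south]
    ]]
  };
}
```

The resulting polygon could then be handed to whatever geometry query the backend already supports.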

Align better with WFS3

In STAC sprint 2 we decided to align with WFS3 for the dynamic API. Implementing WFS3 is not required; the only requirement is the search/stac endpoint, which sat-api provides. But sat-api makes some additional decisions that can lead to some confusion.

This issue defines a baseline of fixes to be less confusing with WFS. In another issue I'll lay out what would be needed for WFS compliance.

So the first thing to do is to make /collections/ endpoints. The mechanics for these are in place; it's just a constraint on the cross-provider search that sat-api offers. So add in

https://sat-api.developmentseed.org/collections/sentinel-2-l1c/items/ which should operate the exact same as https://sat-api.developmentseed.org/search/stac?cx:id=sentinel-2-l1c

And

https://sat-api.developmentseed.org/collections/landsat-8-l1/items/ should work just like https://sat-api.developmentseed.org/search/stac?cx:id=landsat-8-l1

Any additional new type of data (MODIS, etc) should get its own collections/{name} endpoint.

The other bit to shift is the current /collections/ endpoint. This right now returns a list of all the JSON documents of a stac-collection, as specified at https://github.com/radiantearth/stac-spec/blob/dev/extensions/stac-collection-spec.md These are referred to by the STAC Items, and get joined on the fly for the full item output.

As discussed on gitter, these should shift to be 'description' and sit beside /items/. So:

https://sat-api.developmentseed.org/collections/sentinel-2-l1c/description should return exactly what https://sat-api.developmentseed.org/collections?cx:id=sentinel-2-l1c does now, and landsat would be equivalent.

Those endpoints are the core. If we want a way to list all the descriptions available, we could use an endpoint like https://sat-api.developmentseed.org/collections/descriptions

This should be the core for sat-api to be less confusing with WFS. I'll file another issue which should get us quite close to WFS3 compliance - mostly just linking the endpoints together.
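
The equivalence described above (a `/collections/{name}/items/` path behaving exactly like a `search/stac` query constrained by `cx:id`) can be sketched as a small path rewrite. The function name is hypothetical; only the paths and the `cx:id` parameter come from this issue.

```javascript
// Rewrite a WFS3-style items path to the equivalent search/stac query.
// Returns null for paths that are not /collections/{name}/items.
// Hypothetical helper for illustration; not part of sat-api.
function collectionPathToSearch(path) {
  const match = /^\/collections\/([^/]+)\/items\/?$/.exec(path);
  if (!match) return null;
  const name = decodeURIComponent(match[1]);
  return `/search/stac?cx:id=${encodeURIComponent(name)}`;
}
```

With this mapping, `/collections/sentinel-2-l1c/items/` resolves to `/search/stac?cx:id=sentinel-2-l1c`, so a new collection only needs a name, not a new handler.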

add EPSG code field to landsat

landsat-8 does not include the eo:epsg field, as this is not readily available in the index file.

Determine where the EPSG can be determined or calculated from and add to landsat transform lambda.

Landsat Pre-Collection Inventory deprecated

It seems like this API is indexing only the Landsat Pre-Collection Inventory, which is deprecated as of 30th April 2017:

Effective April 30, 2017, newly-acquired Landsat 7 ETM+ and Landsat 8 OLI/TIRS scenes are being processed into the Landsat Collection 1 inventory only; Landsat Pre-Collection inventory is no longer being populated with newly-acquired data.

A query like the following is not returning any results:

https://api.developmentseed.org/satellites/landsat?search=path:196+AND+row:021+AND+acquisitionDate:2017-05-01+TO+2100-01-01&limit=100

{"meta":{"found":0,"name":"sat-api","license":"CC0-1.0","website":"https://api.developmentseed.org/satellites/","page":1,"limit":100},"results":[]}

Even though NASA's EarthExplorer does show data for this query (e.g. 06-MAY-17).

Am I correct here and what can be done about it?

Missing (delay) in Landsat8 product ids

I am running the following to get the latest product ids for the path 143 and row 52:
https://api.developmentseed.org/satellites/?limit=10000&satellite_name=landsat-8&date_from=2018-07-01&date_to=2018-08-22&path=143&row=52

The latest I get is the product id on 25th July 2018, even though the following product exists (10th August 2018), "LC08_L1TP_143052_20180810_20180815_01_T1"

Is there any delay or am I doing something wrong?

Updating resources in kes config does not update stage

Deploying CloudFormation for the first time creates an API gateway with the appropriate resources and the correct endpoints under the stage specified.

However, if the APIGateway resources are changed, it does change the resources, but the endpoints for the stage do not change.

Any idea why this would be @scisco ?

Support for Landsat 8 L2

Hi Matthew,

First of all, thanks for your outstanding work on sat-api and sat-search!

I've been using sat-search to obtain URLs to Landsat 8 scenes. It works like a charm, but it looks like the sat-api does not include support for Level 2 (bottom of atmosphere) scenes, hence searching for oe:platform=landsat-8-l2 does not work. If I'm not mistaken there was support for that with the old API, right?

Do you have plans to incorporate support for L2 on sat-api? If you do, but don't have a lot of time available to implement it right now, could you tell which data sources you were considering to use to retrieve that data?

Document how to make use of results paging

When I submit a query to sat-api, part of the results is a summary of the total number of hits, and what looks like a results paging capability:

properties :{
    "found" : 6811 ,
    "limit" : 1 ,
    "page" : 1
}

I figured out that I can add query parameters to specify the per response limit and page:

https://sat-api.developmentseed.org/search/stac?datetime=2017-08&limit=1&page=2

But I'm not sure what the semantics of page are:

total pages == found
total pages == found/limit

Either way, docs on this would be helpful.
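
Until the semantics are documented, here is a minimal sketch of the second reading, where each page holds up to `limit` results. That reading is an assumption, not confirmed behaviour, and the function names are hypothetical; `limit` and `page` are the query parameters used above.

```javascript
// Number of pages, ASSUMING each page holds up to `limit` results.
function totalPages(found, limit) {
  if (!Number.isInteger(found) || found < 0) {
    throw new RangeError('found must be a non-negative integer');
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  return Math.ceil(found / limit);
}

// List one URL per page by appending limit and page to a search URL.
function pageUrls(searchUrl, found, limit) {
  const separator = searchUrl.includes('?') ? '&' : '?';
  const urls = [];
  const pages = totalPages(found, limit);
  for (let page = 1; page <= pages; page += 1) {
    urls.push(`${searchUrl}${separator}limit=${limit}&page=${page}`);
  }
  return urls;
}
```

Under this assumption the two readings only coincide when `limit` is 1, as in the response shown above.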

Sentinel 2B

👋 @scisco @matthewhanson
Is there any plan to support Sentinel-2B ?

I could draft a PR this weekend if you want!

regards

Consolidate Step Functions in kes

The cloud formation template file specifies the step function, rather than creating them from the kes config. This is largely due to sat-api being created with an early version of kes.

https://github.com/sat-utils/sat-api/blob/develop/.kes/cloudformation.template.yml#L172

The goal here is that users are able to deploy new lambda functions (or step function activities) to perform metadata ingestion of arbitrary sources by adding them to the kes config.yml and not the cloudformation template.

Collection filter is not working

Both return the same results

https://sat-api.developmentseed.org/search/stac?datetime=2017-08&collection=landsat-8

https://sat-api.developmentseed.org/search/stac?datetime=2017-08&collection=sentinel-2

API should provide documentation

API documentation should be provided with the API, rather than a separate branch or set of documentation. This ensures that a single API will be in sync with its documentation (e.g., if someone makes a change to the API and redeploys, the new API will automatically include updated documentation).

upgrade kes to 1.0

Sentinel 2 images for split tiles `sequence`

ref RemotePixel/satellitesearch#5

Sometimes, Sentinel-2 images are split into two directories (sequence).

[sequence] = e.g. 0 - in most cases there will be only one image per day. In case there are more (in northern latitudes), the following images will be 1,2,….

e.g.:

aws s3 ls sentinel-s2-l1c/tiles/38/S/NG/2017/10/9/ --region eu-central-1
                           PRE 0/
                           PRE 1/

sat-api/lambdas/sentinel/index.js

Lines 14 to 16 in 26a73a8

function getSceneId(date, mgrs, version = 0) {
  return `S2A_tile_${date.format('YYYYMMDD')}_${mgrs}_${version}`;
}

the version is hardcoded with default to 0, and then not updated here

sat-api/lambdas/sentinel/index.js

Line 96 in 26a73a8

record.scene_id = getSceneId(date, mgrs);

also set to 0 here

sat-api/lambdas/sentinel/index.js

Line 48 in 26a73a8

Fix ?

I'm not sure how the database is fed, so I'm not sure whether there is an easy fix here.
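
One possible direction, sketched under the assumption that the tile path (as in the `aws s3 ls` output above) is available at ingest time: read the sequence number from the last path segment and pass it through to the scene ID. `parseTilePath` is a hypothetical helper, and this `getSceneId` takes a preformatted `YYYYMMDD` string instead of the date object used in the lambda.

```javascript
// Extract date, MGRS tile and sequence from a path such as
// 'tiles/38/S/NG/2017/10/9/1/'. Hypothetical helper, not from sat-api.
function parseTilePath(path) {
  const pattern = /tiles\/(\d{1,2})\/([A-Z])\/([A-Z]{2})\/(\d{4})\/(\d{1,2})\/(\d{1,2})\/(\d+)\/?$/;
  const match = pattern.exec(path);
  if (!match) throw new Error(`unrecognised tile path: ${path}`);
  const [, zone, band, square, year, month, day, sequence] = match;
  const pad = (n) => String(n).padStart(2, '0');
  return {
    date: `${year}${pad(month)}${pad(day)}`,
    mgrs: `${zone}${band}${square}`,
    sequence: Number(sequence)
  };
}

// Same ID layout as the lambda, but the sequence is supplied by the caller
// instead of silently defaulting to 0 for every record.
function getSceneId(dateString, mgrs, sequence = 0) {
  return `S2A_tile_${dateString}_${mgrs}_${sequence}`;
}
```

For the split tile above, the second directory would then produce an ID ending in `_1` rather than colliding with the first.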

cc @drewbo @scisco

Add tests

Add tests using localstack

landsat response includes invalid GeoJSON for data_geometry

In landsat scenes results, the data_geometry includes the crs, which is not part of the GeoJSON spec. Sentinel results data_geometry section does not include the crs.
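
A minimal sketch of one way to normalise the response: drop the `crs` member before returning the geometry. This is only correct if the coordinates are already WGS84 longitude/latitude, which is an assumption here; `stripCrs` is a hypothetical helper.

```javascript
// Return a copy of a GeoJSON geometry without its `crs` member.
// Assumes coordinates are already WGS84 lon/lat, so nothing is reprojected.
function stripCrs(geometry) {
  if (geometry === null || typeof geometry !== 'object') return geometry;
  const { crs, ...rest } = geometry; // `crs` is discarded on purpose
  return rest;
}
```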

Landsat lambda function ES client error

I'm trying to execute the landsat lambda function manually and am running into the following error:

{
    "errorMessage": "Cannot read property 'indices' of undefined",
    "errorType": "TypeError",
    "stackTrace": [
        "Object.putMapping (/var/task/index.js:364:30)",
        "Object.update (/var/task/index.js:758:13)",
        "satlib.es.client.then.then.then (/var/task/index.js:153750:34)",
        "<anonymous>",
        "process._tickDomainCallback (internal/process/next_tick.js:228:7)"
    ]
}

It seems that the function is being called three times (once for items, once for collections and once with client undefined.)
I added a check to see if client was undefined in the lambda function and it went past the above error, but hit a new one with:

{
    "errorMessage": "client is required",
    "errorType": "AssertionError [ERR_ASSERTION]",
    "stackTrace": [
        "new ElasticsearchWritable (/var/task/index.js:6936:5)",
        "Object.streamToEs (/var/task/index.js:442:20)",
        "processFile (/var/task/index.js:709:13)",
        "processFiles (/var/task/index.js:727:10)",
        "es.putMapping.then (/var/task/index.js:758:17)",
        "<anonymous>",
        "process._tickDomainCallback (internal/process/next_tick.js:228:7)"
    ]
}

This is a fresh deployment of the CloudFormation file, so outside of the ingest files lambda function, nothing else has been run.
Any ideas?

Landsat-8 data are missing after 2018-07-03

There may be a problem. This API request:

https://sat-api.developmentseed.org/search/stac?datetime=2018-07-03/2018-10-01&c:id=landsat-8

returns no results:

{"type":"FeatureCollection","properties":{"found":0,"limit":1000,"page":1},"features":[]}

Consider disabling catch-all paths in API Gateway.

Landsat question

I see that you are getting the list from https://landsat.usgs.gov/landsat/metadata_service/bulk_metadata_files/LANDSAT_8_C1.csv (btw they have gzipped version of it https://landsat.usgs.gov/landsat/metadata_service/bulk_metadata_files/LANDSAT_8_C1.csv.gz)

I believe you do this because it contains polygon coordinates rather than only bbox(min_lat, min_lon, max_lat, max_lon) from AWS and Google. And it contains Pre-Collection and C1 together as bonus.

My question is:
Is it possible to get the list from AWS and project bbox to polygon? Like you do for Sentinel.

I ask because I ran some tests and there are broken AWS links. Since my idea is to use AWS so that I don't have to download scenes, I'm wondering if using the AWS list with projection for polygons is possible, because then I can do correct AOI spatial queries on polygons rather than on a bbox.

I've been researching for a week now and things are all over the place with lists and data :)

Thank you for your work... just wondering if this is possible :)

By polygons I mean (lat, lon) for each of the 4 corner points of the scene.
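
To illustrate what is being asked for, here is a minimal sketch that turns four corner points into a GeoJSON polygon. The input shape (`ul`, `ur`, `lr`, `ll`, each with `lat` and `lon`) is an assumption made for this example; where those corners come from in the AWS listing is not covered here.

```javascript
// Build a GeoJSON Polygon from the four corners of a scene.
// corners: { ul, ur, lr, ll }, each { lat, lon } (hypothetical input shape).
function cornersToPolygon({ ul, ur, lr, ll }) {
  // GeoJSON positions are [longitude, latitude], the reverse of (lat, lon).
  const toPosition = (corner) => [corner.lon, corner.lat];
  // Walk the footprint once and repeat the first corner to close the ring.
  const ring = [ll, lr, ur, ul, ll].map(toPosition);
  return { type: 'Polygon', coordinates: [ring] };
}
```

Unlike a bbox, the ring follows the tilted scene footprint, so spatial queries against it do not match the empty corners of the bounding rectangle.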

More fully implement WFS3

This is a follow-on to #46 with the additional endpoints that we would need to come close to WFS compliance. It is written in ranked order of importance. All would be needed to be compliant, but each gets us closer to getting the spirit of the spec.

landing page at https://sat-api.developmentseed.org/ The minimal to be compatible would be:

{
  "links": [
    { "href": "https://sat-api.developmentseed.org/",
      "rel": "self", "type": "application/json", "title": "This document" },
    { "href": "https://sat-api.developmentseed.org/collections",
      "rel": "data", "type": "application/json", "title": "Metadata about the feature collections" }
  ]
}

To be fully WFS3 compliant there would also be links to /api and /conformance but I'd say those are lower priorities, and listed below.

/collections/ endpoint should return a JSON document that lists all the core collections information. For sat-api with landsat and sentinel it would look like:

{
  "links": [
    { "href": "https://sat-api.developmentseed.org/collections.json",
      "rel": "self", "type": "application/json", "title": "this document" }
  ],
  "collections": [
    {
      "name": "landsat-8-l1",
      "title": "Landsat 8 Level 1",
      "description": "Landsat 8 is a mission of USGS, etc.",
      "extent": {
        "spatial": [ -180, -90, 180, 90 ],
        "temporal": [ "2013-04-11T00:00:00Z", "2018-06-29T12:11:00Z" ]
      },
      "links": [
        { "href": "https://sat-api.developmentseed.org/collections/landsat-8-l1/items",
          "rel": "item", "type": "application/geo+json",
          "title": "Scenes" },
        { "href": "https://sat-api.developmentseed.org/collections/landsat-8-l1/description",
          "rel": "item", "type": "application/geo+json",
          "title": "Landsat 8 Level 1 Collection Descriptions" }
      ]
    },
    {
      "name": "sentinel-2-l1c",
      "title": "Sentinel 2 Level 1C",
      "description": "Sentinel 2 is a mission of ESA, etc.",
      "extent": {
        "spatial": [ -180, -90, 180, 90 ],
        "temporal": [ "2015-06-29T00:00:00Z", "2018-06-29T12:11:00Z" ]
      },
      "links": [
        { "href": "https://sat-api.developmentseed.org/collections/sentinel-2-l1c/items",
          "rel": "item", "type": "application/geo+json",
          "title": "Granules" },
        { "href": "https://sat-api.developmentseed.org/collections/sentinel-2-l1c/description",
          "rel": "item", "type": "application/geo+json",
          "title": "Sentinel 2 Level 1C Collection Descriptions" }
      ]
    }
  ]
}

Each new dataset, like CBERS or MODIS would add its own.

/collections/{collectionName} endpoint The next step should be pretty easy: instead of the list at /collections/, put each 'name' at its own endpoint.
/api For this you need to return the openapi spec of the service. If you're not adding new endpoints super often this should be easy enough to do by hand. But of course ideally it's autogenerated. Can start with https://github.com/radiantearth/stac-spec/blob/master/api-spec/WFS3core%2BSTAC.yaml and just shift the endpoint names. It has the /search/stac endpoint and the WFS endpoints.

And in the root document / should add:

    { "href": "http://data.example.org/api",
      "rel": "service", "type": "application/openapi+json;version=3.0", "title": "the API definition" },

It can be ok to do step #5 first, and then this.

/conformance endpoint. This should be pretty easy. Just make a https://sat-api.developmentseed.org/conformance endpoint and add

{
  "conformsTo": [
    "http://www.opengis.net/spec/wfs-1/3.0/req/core",
    "http://www.opengis.net/spec/wfs-1/3.0/req/oas30",
    "http://www.opengis.net/spec/wfs-1/3.0/req/geojson"
  ]
}

If you skipped to this step past #4 then just don't include the oas30 conformance.
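
The landing page and conformance documents from this issue could be generated rather than hand-written. The sketch below uses hypothetical function and option names; the link relations, media types and conformance URIs are the ones listed in this issue.

```javascript
// Build the conformance document. The oas30 class is only declared when an
// /api endpoint exists, as suggested in the issue.
function conformanceDocument({ hasApi = false } = {}) {
  const base = 'http://www.opengis.net/spec/wfs-1/3.0/req';
  const conformsTo = [`${base}/core`];
  if (hasApi) conformsTo.push(`${base}/oas30`);
  conformsTo.push(`${base}/geojson`);
  return { conformsTo };
}

// Build the landing page; optional links appear only for endpoints that exist.
function landingPage(baseUrl, { hasApi = false, hasConformance = false } = {}) {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const links = [
    { href: `${root}/`, rel: 'self', type: 'application/json', title: 'This document' },
    { href: `${root}/collections`, rel: 'data', type: 'application/json',
      title: 'Metadata about the feature collections' }
  ];
  if (hasApi) {
    links.push({ href: `${root}/api`, rel: 'service',
      type: 'application/openapi+json;version=3.0', title: 'the API definition' });
  }
  if (hasConformance) {
    links.push({ href: `${root}/conformance`, rel: 'conformance', type: 'application/json',
      title: 'WFS 3.0 conformance classes implemented by this server' });
  }
  return { links };
}
```

Driving both documents from the same flags keeps the landing page from advertising endpoints that are not deployed yet.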

And in the root document / should add:

 { "href": "http://data.example.org/conformance",
      "rel": "conformance", "type": "application/json", "title": "WFS 3.0 conformance classes implemented by this server" },

Empty search result for Sentinel-2 from March 2018

Searching for Sentinel-2 images works OK until February 2018. But from March 2018 onwards, the search does not return any results although there are many images from that month available on Amazon AWS:

For example I issued a search request:
https://api.developmentseed.org/satellites/?limit=200&date_from=2018-03-01&date_to=2018-03-22&satellite_name=sentinel-2&utm_zone=33&latitude_band=U&grid_square=VR

The search returns zero results although for tile 33UVR there are images available on AWS for example the image from 20th March 2018:
http://sentinel-s2-l1c.s3.amazonaws.com/tiles/33/U/VR/2018/3/20/0/tileInfo.json

Landsat 8 AWS URL

I think it will be nice to add the AWS URL for Landsat 8 images.

I understand that not all Landsat data are on AWS, so updating the ES database will require two steps:

getting landsat8 csv from USGS
getting scene list hosted on AWS

refactoring sat-api packages

Now that the initial work in packaging up sat-api as a series of modules has been done thanks to @scisco it's time to start thinking about what the core group of packages should look like.

Right now we have:

api: lambda function for use with API Gateway as the backend, includes all the other packages
api-lib: a collection of modules for ingesting into and searching elasticsearch, and ingesting CSV files
manager: lambda function for running management functions on elasticsearch
ingest: lambda function that fetches a CSV file, splits it up, and calls a specified lambda function on each one to ingest it into elasticsearch
landsat: the landsat ingesting lambda called on each split CSV
sentinel: the sentinel ingesting lambda called on each split CSV

However, ingesting CSV files, while a stopgap for now, is not the long term preferred method of ingesting items. Instead we will want a method to ingest a STAC catalog by providing a URL to any STAC json file. It would then traverse the tree down ingesting all records. Since the catalog is STAC there is no need for sensor specific logic here.
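
The traversal described above can be sketched without any sensor-specific logic. This is a minimal illustration, not the planned ingestor: `fetchJson` and `onItem` are hypothetical callbacks supplied by the caller, and the sketch assumes catalogs point to sub-catalogs and items through `child` and `item` links while items are GeoJSON Features.

```javascript
// Walk a STAC catalog tree from rootUrl and call onItem for every item.
// fetchJson(url) must resolve to the parsed JSON document at that URL.
// Returns the number of items visited.
async function traverseCatalog(rootUrl, fetchJson, onItem) {
  const queue = [rootUrl];
  const seen = new Set(); // guards against cycles and duplicate links
  let count = 0;
  while (queue.length > 0) {
    const url = queue.shift();
    if (seen.has(url)) continue;
    seen.add(url);
    const doc = await fetchJson(url);
    if (doc.type === 'Feature') {
      await onItem(doc); // e.g. hand the item to the selected backend
      count += 1;
      continue;
    }
    for (const link of doc.links || []) {
      if (link.rel === 'child' || link.rel === 'item') {
        // Links may be relative, so resolve them against the current document.
        queue.push(new URL(link.href, url).href);
      }
    }
  }
  return count;
}
```

Because the backend only appears behind `onItem`, the same walk could feed Elasticsearch or any other store. Processing one document at a time keeps the sketch simple; a real ingestor would likely need concurrency and retries.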

The catalog ingestor can be used to ingest a catalog to populate sat-api with existing data but the question remains how to update it. How do we know what new records to ingest moving forward? Part of that answer is an SNS ingestor. But what happens if the service goes down and some records are missing. We also need a way to backfill sat-api in this case, and I'm not sure how we efficiently pick up the STAC catalog tree again. How can we go to the last ingested record(s) and discover what is new?

So my proposed new packages would be:

elasticsearch

The backend library for using elasticsearch, both adding records (pass in a STAC item), and searching it.

postgis

A postgis-based backend library for adding records and searching

ingest

Ingests a catalog.json file, traversing it and adding records. Run as Lambda (?) or as an ECS task. User selectable backend (e.g., ES or PostGIS).

sns-ingest

Lambda function for ingesting SNS messages. User selectable backend (e.g., ES or PostGIS)

api

Lambda function backend to an API that uses a selectable backend (e.g., ES or PostGIS)

manager (?)

I'm not sure if this is even needed anymore. It's a matter of convenience: we could opt to have management scripts that can be run locally and manage AWS resources. This may be a bit more transparent.

cc @fredliporace

split: last lines of remoteCsv might not be processed

Hello,

It seems that the last remoteCSV lines are only processed when their count reaches linesPerFile (10000). That leaves up to 9999 lines not indexed, which in the S2A case is the most recent imagery.

I made the following changes and it seems to be working, still testing to be sure that there is no impact on LS8 indexing - I tested only with S2A indexing up to now. I can include that change in a PR if you want.

rgds.

patch.txt
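
The attached patch is not reproduced here. As an independent illustration of the behaviour described, the sketch below splits lines into chunks of `linesPerFile` and flushes the final partial chunk; it works on an in-memory array rather than the streaming code in sat-api, and `splitLines` is a hypothetical name.

```javascript
// Split lines into chunks of at most linesPerFile lines each.
function splitLines(lines, linesPerFile) {
  if (!Number.isInteger(linesPerFile) || linesPerFile < 1) {
    throw new RangeError('linesPerFile must be a positive integer');
  }
  const files = [];
  let current = [];
  for (const line of lines) {
    current.push(line);
    if (current.length === linesPerFile) {
      files.push(current);
      current = [];
    }
  }
  // Flush the remainder: without this step the last (up to linesPerFile - 1)
  // lines would never be written out.
  if (current.length > 0) files.push(current);
  return files;
}
```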

Provide example of geospatial query string

The README says

No search term is more important however than the a geospatial query to find data covering a specific area. The core STAC spec allows for searching by providing a bounding box, with more complex 'intersects' query to query against user provided polygons. Sat-api does not currently support the [simpler] bounding box query, but does support the 'intersects' query.

but it doesn't give an example geospatial query like it does for the temporal queries. What would a GET-encoded query look like for, say, a bounding polygon?
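
For reference, this is how such a URL could be built, assuming the endpoint accepts the GeoJSON geometry as a URL-encoded JSON string in the `intersects` parameter. That encoding is an assumption rather than documented behaviour, and `intersectsUrl` is a hypothetical helper.

```javascript
// Build a GET URL carrying a GeoJSON geometry in the `intersects` parameter.
// ASSUMPTION: the server expects URL-encoded JSON; check the deployed API.
function intersectsUrl(baseUrl, geometry, extraParams = {}) {
  const params = { ...extraParams, intersects: JSON.stringify(geometry) };
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  return `${baseUrl}?${query}`;
}
```

Polygons with many vertices can exceed practical URL length limits, which is one reason a POST body is often preferred for this kind of query.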

Google Cloud index link broken for older Landsat8 scenes (live version of API)

I have noticed that the [google_index] has an incorrect link in some cases:
Example search for November 2013 L8 scenes from path=192, row=25 :
https://api.developmentseed.org/satellites/?limit=200&date_from=2013-11-01&date_to=2013-11-30&satellite_name=landsat-8&path=192&row=025
One scene is found with google_index:
https://console.cloud.google.com/storage/browser/gcp-public-data-landsat/LC08/192/025/LC08_L1GT_192025_20131107_20170428_01_T2
But the correct google_index should be:
https://console.cloud.google.com/storage/browser/gcp-public-data-landsat/LC08/01/192/025/LC08_L1GT_192025_20131107_20170428_01_T2
NOTE the extra /01/ after /LC08/ in the URL.
The other links in the [download_links][google] are correct, only the google_index link is broken.
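
The corrected layout can be derived from the product ID alone. In this sketch the function name and the regular expression are hypothetical; the URL layout, including the `/01/` collection segment, is the one shown above.

```javascript
// Build the google_index URL for a Landsat Collection 1 product ID such as
// LC08_L1GT_192025_20131107_20170428_01_T2.
function googleIndexUrl(productId) {
  const pattern = /^(L[A-Z]\d{2})_[A-Z0-9]{4}_(\d{3})(\d{3})_\d{8}_\d{8}_(\d{2})_[A-Z0-9]{2}$/;
  const match = pattern.exec(productId);
  if (!match) throw new Error(`unrecognised product ID: ${productId}`);
  const [, sensor, path, row, collection] = match;
  const base = 'https://console.cloud.google.com/storage/browser/gcp-public-data-landsat';
  // The collection number sits between the sensor and the path.
  return `${base}/${sensor}/${collection}/${path}/${row}/${productId}`;
}
```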

Parameters: [ArtifactPath, ConfigS3Bucket] must have values

Hi,
It is the first time I am using CloudFormation and I'm not sure what I am doing wrong here.
I get "Parameters: [ArtifactPath, ConfigS3Bucket] must have values" but I'm not sure where I should provide them. Below is the run output.

Thanks,
Tal

kes cf create
.env file is missing
/home/ubuntu/sat-api/.kes/stage.yml was not found. Skipping stage
Template saved to /home/ubuntu/sat-api/.kes/cloudformation.yml
Uploaded: s3://fieldin-sat-api/sat-api-dev/cloudformation.yml
There was an error creating/updating the CF stack
{ ValidationError: Parameters: [ArtifactPath, ConfigS3Bucket] must have values
at Request.extractError (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/protocol/query.js:47:29)
at Request.callListeners (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/request.js:683:14)
at Request.transition (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /usr/lib/node_modules/kes/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/request.js:685:12)
at Request.callListeners (/usr/lib/node_modules/kes/node_modules/aws-sdk/lib/sequential_executor.js:115:18)
message: 'Parameters: [ArtifactPath, ConfigS3Bucket] must have values',
code: 'ValidationError',
time: 2017-09-28T11:23:57.504Z,
requestId: '844cf423-a43f-11e7-9e04-d593350d1a14',
statusCode: 400,
retryable: false,
retryDelay: 5.052365046069696 }
Parameters: [ArtifactPath, ConfigS3Bucket] must have values

Documentation?

Great API! But how do I call it?

I went through most of the code and files but the only bit I found was this

 * @apiParam {Number} limit=1 Limit of results to return.
 * @apiParam {Number} skip=0 Results to skip in return.
 * @apiParam {Number} page=1 Results page to view.
 * @apiParam {String} fields Comma separated list of fields to include in the query results. Ex: `?fields=scene_id,date,cloud_coverage`.
 * @apiParam {String} contains Comma separated lists of form `longitude,latitude`. Returns results if the point  is within the bounding box of an image. Ex `?contains=40.23,70.76`.
 * @apiParam {String} intersects Valid GeoJSON, returns results that touch any point of the geometry.
 * @apiParam {String} scene_id Performs exact search on sceneID field.
 * @apiParam {Number} cloud_from The lower limit for cloud_coverage field.
 * @apiParam {Number} cloud_to The upper limit for cloud_coverage field.

Can anyone point me to some more information?

Lambda Timeouts on Ingest

I've set up the api following the documentation and new packages layout. I wanted to manually start the ingest process so I was following #52 and starting with Landsat. However, when I run the step function I am hitting the lambda timeout limit of 300 seconds. Has anyone else hit this?

The logs posted in #52 seem to indicate that the upload of files should be < 180 seconds.

I'm deploying in us-west-2, so maybe there are latency issues?

START RequestId: c908ae00-aa34-49c5-af3c-954c3c90dc69 Version: $LATEST
2018-07-28T03:52:16.280Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    ingest event: {"satellite":"landsat","arn":"arn:aws:states:us-west-2:126405998223:stateMachine:LandsatMetadataProcessorStateMachine-n78KcI1tLErq","maxFiles":8,"maxLambdas":10,"linesPerFile":300}
2018-07-28T03:52:27.028Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 250 files
2018-07-28T03:52:47.347Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 500 files
2018-07-28T03:53:08.545Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 750 files
2018-07-28T03:53:27.783Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 1000 files
2018-07-28T03:53:56.126Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 1250 files
2018-07-28T03:54:08.518Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 1500 files
2018-07-28T03:54:28.771Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 1750 files
2018-07-28T03:54:50.549Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 2000 files
2018-07-28T03:55:05.618Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 2250 files
2018-07-28T03:55:20.881Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 2500 files
2018-07-28T03:55:38.163Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 2750 files
2018-07-28T03:56:02.818Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 3000 files
2018-07-28T03:56:30.963Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 3250 files
2018-07-28T03:56:44.597Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 3500 files
2018-07-28T03:56:59.005Z    c908ae00-aa34-49c5-af3c-954c3c90dc69    uploaded 3750 files
END RequestId: c908ae00-aa34-49c5-af3c-954c3c90dc69
REPORT RequestId: c908ae00-aa34-49c5-af3c-954c3c90dc69  Duration: 300082.82 ms  Billed Duration: 300000 ms Memory Size: 1024 MB Max Memory Used: 297 MB
2018-07-28T03:57:16.361Z c908ae00-aa34-49c5-af3c-954c3c90dc69 Task timed out after 300.08 seconds

Any ideas or help is much appreciated!

Backup/restore script for elasticsearch

Need a script to back up the indexes on an elasticsearch cluster and restore them to another ES cluster. This will enable the data from a development sat-api stack to be promoted to a production stack without requiring the production instance to ingest the entire set of data again.

add intersects_precentage_threshold query parameter

This parameter helps filter out images that intersect a given geometry but whose overlap is small relative to the image size.

LandsatMetadataProcessorStateMachine not finding index.html

I'm trying to stand up my own instance of the sat-api, and am currently stuck with the lambda not able to find the scene files (?)

START RequestId: 3234ee9a-790a-4e8b-a357-a591c0b17130 Version: $LATEST
2018-07-16T06:02:40.619Z	3234ee9a-790a-4e8b-a357-a591c0b17130
{
    "bucket": "foobar",
    "key": "sat-api-v1-dev/ingest/landsat/",
    "currentFileNum": 4215,
    "lastFileNum": 4215,
    "arn": "arn:aws:states:us-east-1:791757209086:stateMachine:LandsatMetadataProcessorStateMachine-t6P5ZogX0wF0"
}
2018-07-16T06:02:40.747Z	3234ee9a-790a-4e8b-a357-a591c0b17130	connected to elasticsearch
2018-07-16T06:02:40.750Z	3234ee9a-790a-4e8b-a357-a591c0b17130	Processing s3://foobar/sat-api-v1-dev/ingest/landsat/4215.csv
2018-07-16T06:02:41.955Z	3234ee9a-790a-4e8b-a357-a591c0b17130	error processing LC81870162018194LGN00: https://landsat-pds.s3.amazonaws.com/c1/L8/187/016/LC08_L1TP_187016_20180713_20180714_01_RT/index.html not available: 
2018-07-16T06:02:42.298Z	3234ee9a-790a-4e8b-a357-a591c0b17130	error processing LC81870172018194LGN00: https://landsat-pds.s3.amazonaws.com/c1/L8/187/017/LC08_L1TP_187017_20180713_20180714_01_RT/index.html not available: 
2018-07-16T06:02:42.629Z	3234ee9a-790a-4e8b-a357-a591c0b17130	error processing LC81870182018194LGN00: https://landsat-pds.s3.amazonaws.com/c1/L8/187/018/LC08_L1TP_187018_20180713_20180714_01_RT/index.html not available: 
2018-07-16T06:02:42.975Z	3234ee9a-790a-4e8b-a357-a591c0b17130	error processing LC81870192018194LGN00: https://landsat-pds.s3.amazonaws.com/c1/L8/187/019/LC08_L1TP_187019_20180713_20180714_01_RT/index.html not available: 
...

A copy of the file 4215.csv referenced above can be found here:

https://gist.github.com/metasim/8123689f232e0951c3fedfd616c9fc05

Add viewing azimuth field to Landsat

landsat-8 does not include the eo:azimuth field, as this is not readily available in the index file.

Determine where azimuth can be determined or calculated from and add to landsat transform lambda.

Fix autodeploy

datetime filter does not work as specified?

The number of Landsat 8 scenes found since 2018-01-01 is 684, but only 558 since 2017-01-01. Did I do something wrong, or is this a bug?

curl "https://sat-api.developmentseed.org/search/stac" -d '{"collection":"landsat-8",  "limit":1, "datetime":"2018-01-01"}' -H 'Content-Type: application/json' -X POST -s|jq ".properties.found"

684

curl "https://sat-api.developmentseed.org/search/stac" -d '{"collection":"landsat-8",  "limit":1, "datetime":"2017-01-01"}' -H 'Content-Type: application/json' -X POST -s|jq ".properties.found"

558
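
One possible reading, offered as an assumption rather than confirmed behaviour: a single date in `datetime` may be matched as that date only, not as "since that date". A `start/end` range form appears in another request against this API on this page (`datetime=2018-07-05/2019-01-01`), so a "since" query can be sketched with an explicit range. The helper name `buildSinceQuery` is hypothetical and not part of sat-api.

```javascript
// Hypothetical helper (not part of sat-api): builds a POST body for
// /search/stac that covers `start` through `end` using the "start/end" form.
function buildSinceQuery(collection, start, end, limit = 1) {
  const isDate = (s) => /^\d{4}-\d{2}-\d{2}$/.test(s);
  if (!isDate(start) || !isDate(end)) {
    throw new Error('dates must be formatted as YYYY-MM-DD');
  }
  // ISO dates compare correctly as plain strings.
  if (start > end) throw new Error('start must not be after end');
  return { collection, limit, datetime: `${start}/${end}` };
}

// Body that could be passed to curl -d '...' in place of the single date.
const body = buildSinceQuery('landsat-8', '2018-01-01', '2018-12-31');
console.log(JSON.stringify(body));
```

If the two counts above are per-day totals, a range query like this should return a much larger `found` value for the earlier start date.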

Replace deployed sat-api with v0 API

The new API should replace the existing API with no compatibility changes, and should be specified as v0.

http://api.developmentseed.org/satellites/

Re-enable skipped tests in api-lib.

Improve deployment flexibility

Improving the flexibility of the deployment for the user largely depends on #67 .

We want users to be able to deploy a sat-api and be able to configure:

the backend (e.g., ES or postgis)
ingestion endpoints (root STAC catalogs)
initial ingestion(s) (ECS tasks)
ongoing ingestion(s) (SNS)

cc @scisco

How to manually start ingest process

I've just used kes to create my own instance of sat-api, and am attempting to create an initial baseline catalog. Here's what I've attempted to do, and I'm wondering if some advice might be shared.

I see that this process is usually initiated by a cron-type CloudWatch event, which triggers one of the state machine step functions. So I've tried manually starting an execution via the Step Functions console using input such as the following (as gleaned from the cron event input):

{
    "satellite": "landsat",
    "arn": "arn:aws:states:us-east-1:123456:stateMachine:LandsatMetadataProcessorStateMachine-abcdefg",
    "maxFiles": 8,
    "maxLambdas": 10,
    "linesPerFile": 300
}

The step function error I get is:

{
  "error": "Lambda.Unknown",
  "cause": "The cause could not be determined because Lambda did not return an error type."
}

Digging into the CloudWatch logs, I find this:

START RequestId: 938285c1-83b8-41c9-b04c-8143d64b99e1 Version: $LATEST
2018-07-10T15:55:37.200Z	938285c1-83b8-41c9-b04c-8143d64b99e1
{
    "satellite": "landsat",
    "arn": "arn:aws:states:us-east-1:123456:stateMachine:LandsatMetadataProcessorStateMachine-abcdefg",
    "maxFiles": 8,
    "maxLambdas": 10,
    "linesPerFile": 300
}
2018-07-10T15:55:37.305Z	938285c1-83b8-41c9-b04c-8143d64b99e1	connected to elasticsearch
2018-07-10T15:55:37.308Z	938285c1-83b8-41c9-b04c-8143d64b99e1	Processing s3://undefined/undefined0.csv
2018-07-10T15:55:37.319Z	938285c1-83b8-41c9-b04c-8143d64b99e1	MissingRequiredParameter: Missing required key 'Bucket' in params
at ParamValidator.fail (/var/runtime/node_modules/aws-sdk/lib/param_validator.js:50:37)
at ParamValidator.validateStructure (/var/runtime/node_modules/aws-sdk/lib/param_validator.js:61:14)
at ParamValidator.validateMember (/var/runtime/node_modules/aws-sdk/lib/param_validator.js:88:21)
at ParamValidator.validate (/var/runtime/node_modules/aws-sdk/lib/param_validator.js:34:10)
at Request.VALIDATE_PARAMETERS (/var/runtime/node_modules/aws-sdk/lib/event_listeners.js:125:42)
at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at callNextListener (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:95:12)
at /var/runtime/node_modules/aws-sdk/lib/event_listeners.js:85:9
at finish (/var/runtime/node_modules/aws-sdk/lib/config.js:320:7)
at /var/runtime/node_modules/aws-sdk/lib/config.js:338:9
END RequestId: 938285c1-83b8-41c9-b04c-8143d64b99e1
REPORT RequestId: 938285c1-83b8-41c9-b04c-8143d64b99e1	Duration: 209.98 ms	Billed Duration: 300 ms Memory Size: 1024 MB	Max Memory Used: 101 MB	
RequestId: 938285c1-83b8-41c9-b04c-8143d64b99e1 Process exited before completing request

The s3://undefined/undefined0.csv line stands out to me. Note that my .kes/config.yml file has been updated to something like:

default:
  ...
  buckets:
    internal: my-real-bucket

so I'm not sure where else I'm supposed to put the bucket info, and I'm somewhat stuck as to how to get a test instance running.

Any pointers are appreciated!
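
A hedged observation: the successful log in the earlier issue on this page shows an event carrying `bucket`, `key`, `currentFileNum` and `lastFileNum`, while the input used here has none of them. If the lambda builds the path roughly as below (an assumption, not the actual sat-api source), the missing fields would yield exactly `s3://undefined/undefined0.csv`. The names `csvPath` and `missingFields` are hypothetical.

```javascript
// Assumed shape of the path construction; NOT taken from the sat-api source.
function csvPath(event) {
  const fileNum = event.currentFileNum || 0;
  return `s3://${event.bucket}/${event.key}${fileNum}.csv`;
}

// Hypothetical guard: lists required string fields absent from the input.
function missingFields(event) {
  return ['bucket', 'key'].filter((name) => typeof event[name] !== 'string');
}

// Input copied from the cron-style event used in this issue.
const cronStyleInput = {
  satellite: 'landsat', maxFiles: 8, maxLambdas: 10, linesPerFile: 300
};
console.log(csvPath(cronStyleInput));
console.log(missingFields(cronStyleInput));

// Input shaped like the event in the successful log.
const fullInput = {
  bucket: 'foobar',
  key: 'sat-api-v1-dev/ingest/landsat/',
  currentFileNum: 4215,
  lastFileNum: 4215
};
console.log(csvPath(fullInput));
```

Under that assumption, starting the execution with `bucket` and `key` present in the input may be enough to get past the `MissingRequiredParameter` error.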

Pre-collection LS8 in AWS with distinct archive version from what is being indexed (live version API)

It seems that the results returned for some Pre-collection LS8 scenes in AWS use an archive version that is not (yet?) available in AWS.

Take for instance this query:
https://api.developmentseed.org/satellites/?satellite_name=landsat-8&limit=1&path=217&row=076&acquisitionDate=2016-01-15

The AWS-related returned data indicates an archive version 02, for instance:
http://landsat-pds.s3.amazonaws.com/L8/217/076/LC82170762016015LGN02/LC82170762016015LGN02_thumb_large.jpg

The above file is not available. If I manually change to use archive version 00 it works:
http://landsat-pds.s3.amazonaws.com/L8/217/076/LC82170762016015LGN00/LC82170762016015LGN00_thumb_large.jpg

The "Landsat on AWS" browse shows only archive version 00 for the selected scene:
https://landsatonaws.com/L8/217/076/
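
As a client-side workaround sketch, the manual version change described above can be automated by trying lower archive versions in turn. The scene ID layout used here (sensor, path, row, year, day of year, station, version) is inferred from the URLs quoted in this issue, and the function names are hypothetical.

```javascript
// Splits a pre-collection Landsat 8 scene ID such as LC82170762016015LGN02.
function parseSceneId(sceneId) {
  const m = /^(LC8)(\d{3})(\d{3})(\d{4})(\d{3})([A-Z]{3})(\d{2})$/.exec(sceneId);
  if (!m) throw new Error(`unrecognised scene ID: ${sceneId}`);
  return {
    prefix: m[1], path: m[2], row: m[3],
    year: m[4], doy: m[5], station: m[6], version: m[7]
  };
}

// Builds the large-thumbnail URL in the form quoted in this issue.
function thumbnailUrl(sceneId) {
  const { path, row } = parseSceneId(sceneId);
  return `http://landsat-pds.s3.amazonaws.com/L8/${path}/${row}/${sceneId}/${sceneId}_thumb_large.jpg`;
}

// Hypothetical fallback: URLs from the reported version down to version 00.
function candidateThumbnailUrls(sceneId) {
  const p = parseSceneId(sceneId);
  const urls = [];
  for (let v = Number(p.version); v >= 0; v -= 1) {
    const version = String(v).padStart(2, '0');
    urls.push(thumbnailUrl(`${p.prefix}${p.path}${p.row}${p.year}${p.doy}${p.station}${version}`));
  }
  return urls;
}

console.log(candidateThumbnailUrls('LC82170762016015LGN02'));
```

A caller would request each URL in order and keep the first one that exists; this only masks the mismatch and does not fix the indexed metadata.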

sort results

Results should be ordered newest first by default.

It would also be good to support an 'order' argument for ascending or descending.

Sorting on other fields could be useful but is not really required; there are only a couple of properties for which it might make sense.
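
The requested behaviour can be sketched in memory as follows. This is only an illustration: a real implementation would more likely push the sort down to the backend query, and both the `order` values (`'asc'`/`'desc'`) and the use of `properties.datetime` as the sort key are assumptions.

```javascript
// Sorts STAC-like features by properties.datetime; newest first by default.
function sortByDatetime(features, order = 'desc') {
  if (order !== 'asc' && order !== 'desc') {
    throw new Error("order must be 'asc' or 'desc'");
  }
  const sign = order === 'asc' ? 1 : -1;
  // Copy before sorting so the caller's array is left untouched.
  return [...features].sort((a, b) =>
    sign * (Date.parse(a.properties.datetime) - Date.parse(b.properties.datetime)));
}

// Made-up sample items for illustration only.
const sample = [
  { id: 'a', properties: { datetime: '2018-01-01T10:00:00Z' } },
  { id: 'b', properties: { datetime: '2018-07-13T10:00:00Z' } },
  { id: 'c', properties: { datetime: '2017-03-05T10:00:00Z' } }
];
console.log(sortByDatetime(sample).map((f) => f.id));
console.log(sortByDatetime(sample, 'asc').map((f) => f.id));
```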

Make HTML endpoints for nicer navigation of the service

One of the nicest things about WFS3 is how HTML is a top-level citizen. See, for example, http://geo.weather.gc.ca/geomet-beta/features/?f=html

It'd be nice to do the same with sat-api. The core navigation should be relatively easy, and would be a big win. It'd mostly just be f=html or page.html type links for each of the endpoints.

The other bit is the actual items, which are more challenging, but could hopefully make use of the https://github.com/radiantearth/stac-browser code in some way: use the same JS code to generate just the 'page', like http://planet-stac.netlify.com/item/hurricane-harvey/0831/20170831_195552_SS02, from the JSON.
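
For the core navigation part, a minimal sketch of the `f=html` switch might look like the following. The `render` function, its return shape, and the `title`/`links` fields are hypothetical, and link `href` values are assumed to carry no query string of their own.

```javascript
// Escapes text for safe inclusion in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: same data, JSON by default, HTML when f=html.
function render(data, query = {}) {
  if (query.f !== 'html') {
    return { contentType: 'application/json', body: JSON.stringify(data) };
  }
  // Keep f=html on every link so navigation stays in the HTML view.
  const items = data.links
    .map((l) => `<li><a href="${escapeHtml(l.href)}?f=html">${escapeHtml(l.title)}</a></li>`)
    .join('');
  return {
    contentType: 'text/html',
    body: `<h1>${escapeHtml(data.title)}</h1><ul>${items}</ul>`
  };
}

// Made-up endpoint listing for illustration only.
const root = {
  title: 'sat-api',
  links: [{ href: '/collections', title: 'Collections' }]
};
console.log(render(root, { f: 'html' }).body);
```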

	function getSceneId(date, mgrs, version = 0) {
	  return `S2A_tile_${date.format('YYYYMMDD')}_${mgrs}_${version}`;
	}


Fix ?

elasticsearch

postgis

ingest

sns-ingest

api

manager (?)
