
gsip's People

Contributors

dblodgett-usgs, denevers, dependabot[bot], heryk, jvanulde, rtm-nrcan, tomkralidis

gsip's Issues

Misleading header attribute in data configuration

In the data configuration (/data/), an attribute is used to override the media type when the requested media type is not exactly what is being returned. That attribute is called "header", which is misleading because it gives the impression that any HTTP header could be overridden. It should be named "media-type" or "alt-media-type".

Provide more cross-border relations

In order to demonstrate the harvesting of linkages, there is an immediate need for more relations. Currently, there is one relation encoded in the harvester:

<https://cida-test.er.usgs.gov/chyld-pilot/id/hu/041504081604>
<https://www.opengis.net/def/hy_features/ontology/hyf/lowerCatchment>
    <https://geoconnex.ca/id/catchment/02OJ*CA>.

HY_Features Relations

Wondering what the best path is going to be to get HY_Features relations implemented in GSIP. I'd be happy to contribute if @tomkralidis or @denevers can point me to the RTFM and / or the files I can hack on.

Add multiple representation of info

Depending on the usage, multiple /info/ representations might be necessary. The challenge is how to negotiate a particular /info/ for a given /id/. Since there should be only one /id/ per real-world feature in a given governance, the resolution to multiple /info/ cannot rely on the /id/ structure.

One can rely only on

  • specialised format in the content negotiation (or subtypes of formats, such as text/html; subtype=gin)
  • profile negotiation (in http header)
  • parameters (profile=gin)

or a combination of those options.
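
The three options can be combined in a single resolution step. The following is a minimal sketch, not GSIP code: the class name InfoProfileResolver, the precedence order (query parameter first, then a profile header, then a media type parameter), and the use of a header such as Accept-Profile are all assumptions made for illustration.

```java
import java.util.Optional;

// Hypothetical helper: picks the /info/ profile a client asked for.
final class InfoProfileResolver {

    /**
     * Returns the requested profile key, or empty when the client expressed no preference.
     * Assumed precedence: profile= parameter, then profile header, then a
     * "subtype" parameter on one of the Accept media ranges.
     */
    static Optional<String> resolve(String profileParam, String profileHeader, String acceptHeader) {
        if (notBlank(profileParam)) {
            return Optional.of(profileParam.trim());
        }
        if (notBlank(profileHeader)) {
            // Keep only the first listed profile and drop any ;q= weight.
            String first = profileHeader.split(",", -1)[0].split(";", -1)[0].trim();
            if (first.startsWith("<") && first.endsWith(">")) {
                first = first.substring(1, first.length() - 1);
            }
            if (!first.isEmpty()) {
                return Optional.of(first);
            }
        }
        if (notBlank(acceptHeader)) {
            for (String range : acceptHeader.split(",")) {
                String[] parts = range.split(";");
                for (int i = 1; i < parts.length; i++) {
                    String[] kv = parts[i].trim().split("=", 2);
                    if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("subtype")) {
                        return Optional.of(kv[1].trim());
                    }
                }
            }
        }
        return Optional.empty();
    }

    private static boolean notBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }
}
```

A handler could call resolve(...) once and then select the matching /info/ template, falling back to a default template when the result is empty.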

replace clunky format override with a cleaner filter

GSIP content negotiation can override the Accept header by passing an f= parameter that uses a list of well-known format keys (stored in the configuration). Right now, clunky code in each handler tests for this override before invoking the handler logic.

A cleaner way to do this is to implement a @PreMatching filter that changes the Accept header:

import java.io.IOException;
import java.util.ArrayList;

import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.container.PreMatching;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.ext.Provider;

@Provider
@PreMatching
public class FormatRequestFilter implements ContainerRequestFilter {

	@Override
	public void filter(ContainerRequestContext requestContext) throws IOException {
		String format = requestContext.getUriInfo().getQueryParameters().getFirst("f");
		ArrayList<String> mediaTypes = new ArrayList<>();
		if ("application/json".equals(format) || "json".equals(format))
			mediaTypes.add(MediaType.APPLICATION_JSON);
		if ("text/html".equals(format) || "html".equals(format))
			mediaTypes.add(MediaType.TEXT_HTML);
		if (mediaTypes.size() > 0)
			requestContext.getHeaders().put("Accept", mediaTypes);
		
	}

}

This overrides the header before the call is routed to the right handler.

Review AWS Offerings

Assess technologies that can be leveraged on AWS.

  • Blazegraph
  • Neo4J
  • Neptune (Blazegraph on AWS)
  • Dgraph

Linking a non LOD app (CHyF) with GSIP

Currently there is a gap between the CHyF pilot project and GSIP when one wants to find, for example, the closest hydrometric station to a point or within a catchment. One way is to fetch all the watersheds, get their geometries, find which one intersects the point, and look for a relation with a hydrometric station.

Some alternatives are available and could simplify the linkage between the two applications.

  1. Add a geometric component on each item so that it can be searched via a GeoSPARQL query.
  2. Load the HUC 12 watersheds in the CHyF model. Its ID would be used to access the LOD API and find the information needed.
  3. Build an API on top of GSIP that takes a geometry and returns the ID of the watershed(s) that intersects it.
  4. ...

We have to find out which solution is best suited for our needs.
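
To make option 3 more concrete, here is a rough sketch of the lookup such an API would perform for the simplest geometry, a point. It is only an illustration: the class WatershedLocator, the Watershed type and the plain coordinate arrays are hypothetical, and a real service would more likely delegate to a spatial index or a GIS library than to a hand-written ray-casting test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lookup behind "geometry in, watershed IDs out" (option 3).
final class WatershedLocator {

    /** A watershed reduced to an identifier and one outer ring of {x, y} vertices. */
    static final class Watershed {
        final String id;
        final double[][] ring;

        Watershed(String id, double[][] ring) {
            this.id = id;
            this.ring = ring;
        }
    }

    /** Returns the IDs of all watersheds whose ring contains the point. */
    static List<String> watershedsContaining(double x, double y, List<Watershed> watersheds) {
        List<String> ids = new ArrayList<>();
        for (Watershed w : watersheds) {
            if (contains(w.ring, x, y)) {
                ids.add(w.id);
            }
        }
        return ids;
    }

    /** Ray-casting test: counts how many ring edges a horizontal ray from the point crosses. */
    static boolean contains(double[][] ring, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = ring.length - 1; i < ring.length; j = i++) {
            double xi = ring[i][0], yi = ring[i][1];
            double xj = ring[j][0], yj = ring[j][1];
            boolean crosses = (yi > y) != (yj > y);
            if (crosses && x < (xj - xi) * (y - yi) / (yj - yi) + xi) {
                inside = !inside;
            }
        }
        return inside;
    }
}
```

The returned IDs would then be used as /id/ keys against the LOD API to follow relations to hydrometric stations.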

Instead of coding user/password in ftp url, system should ask the user

In one case, we have a representation that points to a private FTP server. The username and password are coded in the FTP URL (ftp://user:password@,,,). This means the information is now publicly visible on the web (because the proxy pulls the data from the FTP server). This is probably a bad idea: if the server is password protected, users should have to provide their own credentials (if they have them) when redirected to the FTP site.
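
As a first step, configured URLs could be checked for embedded credentials and published without them. The sketch below uses only java.net.URI; the class name FtpUrlSanitizer and the example host are hypothetical, and it does not cover how the user would later be prompted for credentials.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: detects and removes user:password@ from a URL.
final class FtpUrlSanitizer {

    /** True when the URL carries a user-info part such as user:password. */
    static boolean hasCredentials(String url) {
        return URI.create(url).getUserInfo() != null;
    }

    /** Rebuilds the URL from its components, leaving the user-info part out. */
    static String stripCredentials(String url) {
        URI u = URI.create(url);
        try {
            return new URI(u.getScheme(), null, u.getHost(), u.getPort(),
                    u.getPath(), u.getQuery(), u.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URL without credentials", e);
        }
    }
}
```

With this in place, the proxy could refuse to load a configuration entry for which hasCredentials returns true, or log a warning and expose only the stripped URL.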

Add USGS association in CAN github

Now that we have more links thanks to Dave, should we add them to our own?
Or should we use this opportunity to show that nodes do not necessarily have symmetric content?

Can we have label over geojson representation on the map

@heryk: When we click on something on the map (a catchment), we get a bubble showing relations to other features, and some might have multiple pushpins for GeoJSON representations. Is it possible to have a label for each of the pushpins when the mouse hovers over it?

Deal with transient features

Some features in CHyF are transient: they only exist in a particular version of the database. This is the case for small streams that may or may not exist to complete the network (?) or that move and split into different segments. Assigning a persistent URI to those features is an issue because a feature might not exist in a future iteration. But we still need them because they might be referred to (by a monitoring station, or by another segment).

Maybe one way to address this would be to use HTTP 301 (Moved Permanently) to tell clients that the segment (the real feature) does not exist anymore (this means the system will need to keep track of them for a certain time).

Another solution is that we don't create a Non-Information Resource (NIR) for transient features and only allow access to non-transient representations through a persistent one. This means we need another kind of NIR -> IR link. The only one we support right now is "seeAlso", linking a NIR to an IR.
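
The first idea (HTTP 301 for retired segments) implies some bookkeeping of which URIs were retired and what replaced them. A minimal sketch of that bookkeeping follows; the class TransientFeatureRegistry and its methods are hypothetical, and the choice to follow chains of successive retirements is an assumption rather than something decided in this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry of current and retired feature URIs.
final class TransientFeatureRegistry {

    /** HTTP status plus the URI to serve or redirect to (null for 404). */
    static final class Resolution {
        final int status;
        final String location;

        Resolution(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    private final Set<String> current = new HashSet<>();
    private final Map<String, String> retired = new HashMap<>();

    void addCurrent(String uri) {
        current.add(uri);
    }

    /** Marks oldUri as gone and remembers which URI replaces it. */
    void retire(String oldUri, String successorUri) {
        current.remove(oldUri);
        retired.put(oldUri, successorUri);
    }

    Resolution resolve(String uri) {
        if (current.contains(uri)) {
            return new Resolution(200, uri);
        }
        String target = retired.get(uri);
        if (target == null) {
            return new Resolution(404, null);
        }
        // Follow successive retirements; the seen set guards against cycles.
        Set<String> seen = new HashSet<>();
        while (retired.containsKey(target) && seen.add(target)) {
            target = retired.get(target);
        }
        return new Resolution(301, target);
    }
}
```

Entries in the retired map could be expired after the "certain time" mentioned above, at which point the old URI would start returning 404.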

Add a hook function to convert a MIR into a NIR instead of relying on internal logic

Currently, when a MIR (/info/) is requested from GSIP, it can "know" which NIR we are referring to by replacing /info/ with /id/. This assumption is embedded in the GSIP architecture. It prevents the GSIP MIR template from being used with arbitrary PIDs. A use case proposed by @dblodgett-usgs is to have the NIR managed by a PID that would redirect to GSIP to do the MIR job (pull data from the storage and build the page). Right now this is not possible because GSIP will always assume that the MIR is a representation of a NIR built by replacing /info/ with /id/.

This logic is embedded in Information.java

public  String getIdResource(UriInfo uriInfo)
	{
		// get the /id/ resource and convert to persistant
		String persistentUri = Manager.getInstance().getConfiguration().getParameterAsString(PERSISTENT_URI, BASE_URI);
		// this one is called by /info/, but the real NIR is /id/
		StringBuilder infoUri = new StringBuilder(persistentUri + "/id");
		boolean first = true;
		for(PathSegment segment: uriInfo.getPathSegments())
		{
			if (first) {first = false; continue;}
			infoUri.append("/"+segment.getPath());
		}
		//Logger.getAnonymousLogger().log(Level.INFO, "processed URI:" + uriInfo.getPath());
		//Logger.getAnonymousLogger().log(Level.INFO,"/id/ URI:" + infoUri);
		return infoUri.toString();
		// 
	}

(https://github.com/NRCan/GSIP/blob/master/src/nrcan/lms/gsc/gsip/Information.java)

One solution would be to add a hook into a scripting language (for example JavaScript through Nashorn, or Python), or a regular expression, that the user could configure to provide a custom MIR -> NIR transformation. The absence of a hook would result in the current transformation logic (info -> id).
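
The regular expression variant is the simplest to picture. The sketch below is an assumption about how such a hook might look, not existing GSIP code: MirToNirHook is a hypothetical class, the pattern and replacement would come from configuration, and the default shown here is a simplification of getIdResource (it rewrites the path in place instead of rebuilding the URI from the persistent base URI).

```java
import java.util.regex.Pattern;

// Hypothetical configurable MIR -> NIR transformation based on a regular expression.
final class MirToNirHook {

    private final Pattern pattern;
    private final String replacement;

    /** Both arguments are assumed to be read from the GSIP configuration. */
    MirToNirHook(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    /** Fallback used when no hook is configured: /info/ becomes /id/. */
    static MirToNirHook defaultHook() {
        return new MirToNirHook("/info/", "/id/");
    }

    /** Rewrites the first match; URIs that do not match are returned unchanged. */
    String toNir(String mirUri) {
        return pattern.matcher(mirUri).replaceFirst(replacement);
    }
}
```

For the PID use case, a deployment could configure something like the pattern ^https://gsip\.example\.org/info/(.+)$ with the replacement https://pid.example.org/$1, where both hosts are placeholders.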

Complex RDF takes lot of time from Jena

Jena takes a lot of time to compile a small RDF model, and it might not scale very well when we add more data. This is particularly an issue when using EmbeddedStore. We are using a full OWL reasoner; maybe we could settle for something less demanding, since we need only a limited set of functionalities such as symmetric properties, inverseOf, transitivity and inheritance (others?).

Another option is to NOT use RDF and rely on a clever relational database for the whole database and just pull RDF from it to create the landing page.

Ontology loading error

On startup I am getting the following error:

Failed to load /usr/local/tomcat/webapps/gsip/repos/gsip/ontology.ttl
CHyLD | org.apache.jena.riot.RiotException: [line: 64, col: 1 ]
Triples not terminated by DOT

Test conneg process to get direct access to some representations

If an /id/ is requested in a MIME type that is not one of the 4 supported MIME types (or is not a hypermedia format), and if there is only 1 representation in that format, the conneg can 303 directly to that resource.

e.g.: if an /id/ resource is requested as PDF and only one PDF representation is available, the conneg process will 303 directly to this .pdf. Otherwise, it will return 404, UNLESS the Accept header also has one of the aforementioned hypermedia formats in its preferred list.
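
The rule can be written down as a small decision function, which also makes it easy to test. This is a sketch under assumptions: IdConneg and Decision are hypothetical names, the Accept header is assumed to be already parsed into a list ordered by preference, and walking that list in order (so that a hypermedia type listed after a non-matching type still wins over 404) is one possible reading of the rule.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical decision logic for content negotiation on an /id/ resource.
final class IdConneg {

    /** Outcome of the negotiation: an HTTP status and, for 303, the redirect target. */
    static final class Decision {
        final int status;
        final String location;

        Decision(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    static Decision decide(List<String> acceptedInOrder,
                           Set<String> hypermediaTypes,
                           Map<String, List<String>> representationsByType,
                           String infoUri) {
        for (String mediaType : acceptedInOrder) {
            if (hypermediaTypes.contains(mediaType)) {
                // Supported hypermedia format: redirect to the /info/ page.
                return new Decision(303, infoUri);
            }
            List<String> matches = representationsByType.get(mediaType);
            if (matches != null && matches.size() == 1) {
                // Exactly one representation in this format: redirect straight to it.
                return new Decision(303, matches.get(0));
            }
        }
        return new Decision(404, null);
    }
}
```

A format with zero or several representations is skipped, so the request ends in 404 unless a later entry in the list is a hypermedia format.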

Fix format override to use configuration instead of hardcode

The Information handler uses this ugly code:

if (format != null && format.trim().length() > 0)
		{
			// TODO: use the Configuration instead
			if ("rdf".equalsIgnoreCase(format) || "application/rdf+xml".equalsIgnoreCase(format)) of = InfoOutputFormat.ioRDFXML;
			if ("ttl".equalsIgnoreCase(format) || "text/turtle".equalsIgnoreCase(format)) of = InfoOutputFormat.ioTURTLE;
			if ("xml".equalsIgnoreCase(format) || "text/xml".equalsIgnoreCase(format)) of = InfoOutputFormat.ioXML;
			if ("html".equalsIgnoreCase(format) || "htm".equalsIgnoreCase(format) || "text/html".equalsIgnoreCase(format)) of = InfoOutputFormat.ioHTML;
			if ("json".equalsIgnoreCase(format) || "jsonld".equalsIgnoreCase(format) || "application/ld+json".equalsIgnoreCase(format))  of = InfoOutputFormat.ioJSONLD;
			// otherwise, file not found
			if (of == InfoOutputFormat.ioUnknown)
				return Response.status(HttpStatus.SC_UNSUPPORTED_MEDIA_TYPE).entity(format +" extension is not supported : use html/ttl/rdf/xml/json").build();

			
		}

instead of using the list of MIME types in the configuration file.
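
A table-driven lookup would remove the chain of if statements. The sketch below shows the idea with a hypothetical FormatRegistry class; the entries in defaults() simply mirror the hardcoded keys above, and in practice they would be loaded from the configuration file, whose actual structure is not shown in this issue.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical registry mapping f= keys (and full MIME types) to a media type.
final class FormatRegistry {

    private final Map<String, String> mediaTypeByKey = new LinkedHashMap<>();

    /** Registers a media type together with the short keys that may select it. */
    FormatRegistry register(String mediaType, String... keys) {
        mediaTypeByKey.put(normalize(mediaType), mediaType);
        for (String key : keys) {
            mediaTypeByKey.put(normalize(key), mediaType);
        }
        return this;
    }

    /** Case-insensitive lookup; empty when the format is missing or unknown. */
    Optional<String> lookup(String format) {
        if (format == null || format.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(mediaTypeByKey.get(normalize(format)));
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    /** Stand-in for values that would come from the configuration. */
    static FormatRegistry defaults() {
        return new FormatRegistry()
                .register("application/rdf+xml", "rdf")
                .register("text/turtle", "ttl")
                .register("text/xml", "xml")
                .register("text/html", "html", "htm")
                .register("application/ld+json", "json", "jsonld");
    }
}
```

The handler would then answer with an unsupported media type response only when lookup returns empty, and the same registry could feed a pre-matching filter that rewrites the Accept header.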

Rework README

Need to reflect the repository name change from GSIP to GeoConnex
