Giter VIP home page Giter VIP logo

solrcene's Introduction

This is Chris Mattmann's port of the now combined Apahce Solr and Apache Lucene 
modules. I call it "Solrcene". My port is specifically focused on spatial search 
and supporting it with Solr. 

Basically I've taken the following uncommitted patches from 
Apache JIRA:

SOLR-2073 Geonames.org UpdateProcessor for Spatial
SOLR-2074 GeoRSS ResponseWriter
SOLR-2075 SpatialQParserPlugin and HostIP adaptor
SOLR-2076 Spatial example schema updates
SOLR-2077 Spatial example solconfig updates
SOLR-2079 Expose HttpServletRequest object from SolrQueryRequest object
SOLR-2081 BaseResponseWriter isStreamingDocs causes SingleResponseWriter.end
to be called 2x
SOLR-2082 Geopost.jar for loading geonames data

These contributions were the results of the final project of one my students
in CSCI 572: Search Engines and Information Retrieval [1] at USC this past
Summer (2010), William Quach.. William and I built off of the existing Spatial 
code in Solr to provide some more robustness, and out of the box ease-of-use for 
Spatial. Basically what we did at a high level:

* allow easy loading of Geonames.org data using an UpdateRequestProcessor.
You can simply use the Geopost.jar tool we provided to load up exports from
Geonames.org, which contains city, state, and locale information as well as
coordinates (lat, lon)

* load up geo-tagged data (lat, lon). This was already provided, but we
included some default schema fields to automatically suck these up on
ingestion. For our examples, we grabbed a regular RSS news feed, ran it
through the geonames.org GeoRSS converter, and transformed it to a Solr
input XML file.

* Once Geonames.org data has been loaded, and you've loaded up a couple of
docs that have been geotagged (with lat, lon), you can use our extended
spatial Qparser, to issue queries like:

   {!spatial ct=[city] s=[state] c=[country] d=[search radius]}search text
   e.g. {!spatial ct=Orlando s=FL c=US d=400}NASA

(which in our example looked up articles about NASA and JPL stories within
400 km (or miles) of Orlando, FL in the US).

* allowed for Host IP->lat, lon detection and a pluggable framework to
incorporate services like hostip to provide this functionality. That way, if
a user doesn't include city, state, etc., in their spatial query, they still
get articles "close" to them when using the spatial filter based on host IP
detection. Though this isn't perfect, and largely dependent on the
architectural topology, it's a really good start.

Those are the high level features. We had to fix a bug along the way, and
additionally allow for access to some of the objects that Solr tries to
insulate you from (of course, with good reason), but it's still a fairly
robust spatial solution, and we are incorporating it into several of our
projects here at JPL.

We hope it helps out the community and that folks find it useful.

[1] http://sunset.usc.edu/classes/cs572_2010/

solrcene's People

Contributors

chrismattmann avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.