ElasticSearch Mock Solr Plugin

Mock Solr Plugin    ElasticSearch        Lucene/Solr
master              0.20.2 → 0.20.X      3.6.2
1.1.4               0.20.2 → 0.20.X      3.6.2
1.1.3               0.19.3 → 0.20.1      3.6.0
1.1.2               0.19.0 → 0.19.2      3.5.0
1.1.1               0.18.6 → 0.18.7      3.5.0
1.1.0               0.18.0 → 0.18.5      3.5.0

Use Solr clients/tools with ElasticSearch

This plugin lets you use tools that were built for Solr
against ElasticSearch.

The idea for this plugin came when I wanted to use Nutch with
ElasticSearch. Instead of extending Nutch itself,
I thought it would be nice to be able to use any Solr client with
ElasticSearch. Some projects that can now be used are
Nutch, Apache ManifoldCF, and any tool using SolrJ. It
should also be possible to use non-Java tools that write to
Solr using the XML update and request handlers.
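
As a rough illustration of what a tool that does not use SolrJ would send, the sketch below builds a Solr XML add message for one document. The class and method names are hypothetical and not part of the plugin. It assumes the message is then POSTed with an XML content type to the update handler under your /_solr URL, which is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the plugin: builds a Solr XML add message. */
final class XmlUpdateSketch {
    private XmlUpdateSketch() {}

    /** Builds an add message containing a single document with the given fields. */
    static String addMessage(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(f.getKey())).append("\">")
              .append(escape(f.getValue())).append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        doc.put("id", "id1");
        doc.put("name", "R&D report");
        // Assumption: this body would be POSTed to <your _solr URL>/update.
        System.out.println(addMessage(doc));
    }
}
```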

Supported Solr features

  • Update handlers
    • XML Update Handler (i.e. /update)
    • JavaBin Update Handler (i.e. /update/javabin)
  • Search handler (i.e. /select)
    • Basic Lucene queries using the q parameter
    • start, rows, and fl parameters
    • sorting
    • filter queries (fq parameters)
    • hit highlighting (hl, hl.fl, hl.snippets, hl.fragsize, hl.simple.pre, hl.simple.post)
    • faceting (facet, facet.field, facet.query, facet.sort, facet.limit)
  • XML and JavaBin request and response formats
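
To make the search parameters above concrete, here is a minimal sketch that assembles a /select URL by hand using only the JDK. The class and method names are hypothetical, and it assumes the search handler is reachable by appending /select to the /_solr URL (which is the path SolrJ requests). A plain map allows only one value per parameter, so repeated fq parameters are out of scope for this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the plugin: builds a Solr-style select URL. */
final class SelectUrlSketch {
    private SelectUrlSketch() {}

    /** Appends /select and the URL-encoded parameters to the given _solr base URL. */
    static String selectUrl(String solrBase, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(solrBase).append("/select");
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            sb.append(separator).append(encode(p.getKey()))
              .append('=').append(encode(p.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "name:doc1");          // basic Lucene query
        params.put("fq", "price:[10 TO 20]");  // filter query
        params.put("start", "0");
        params.put("rows", "10");
        params.put("fl", "id,name");           // fields to return
        params.put("hl", "true");              // hit highlighting
        params.put("hl.fl", "name");
        params.put("facet", "true");           // faceting
        params.put("facet.field", "name");
        System.out.println(
            selectUrl("http://localhost:9200/testindex/testtype/_solr", params));
    }
}
```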

How do you build this plugin?

Use Maven to build the package:

mvn package

Then install the plugin

# if you've built it locally
$ES_HOME/bin/plugin -url file:./target/releases/elasticsearch-mocksolrplugin-*.zip -install mocksolrplugin

How do you use this plugin?

Point your Solr client/tool at your ElasticSearch instance and append
/_solr to the URL.

http://localhost:9200/${index}/${type}/_solr

${index} – the ES index you want to index/search against. Default “solr”.
${type} – the ES type you want to index/search against. Default “docs”.

Example paths:


// Will search/index against index “solr” and type “docs”
http://localhost:9200/_solr

// Will search/index against index “testindex” and type “docs”
http://localhost:9200/testindex/_solr

// Will search/index against index “testindex” and type “testtype”
http://localhost:9200/testindex/testtype/_solr
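
The URL scheme above can be sketched as a small helper that builds the base URL to hand to a Solr client. The class and method names are hypothetical and not part of the plugin. Passing null for the index or type simply leaves that path segment out, so the plugin's defaults apply.

```java
/** Hypothetical helper, not part of the plugin: builds the mock Solr base URL. */
final class MockSolrEndpoint {
    private MockSolrEndpoint() {}

    /**
     * @param host  ElasticSearch HTTP address, e.g. "http://localhost:9200"
     * @param index ES index, or null to fall back to the plugin default
     * @param type  ES type, or null to fall back to the plugin default
     */
    static String url(String host, String index, String type) {
        if (index == null && type != null) {
            // The path is index first, then type, so a type alone cannot be expressed.
            throw new IllegalArgumentException("a type requires an index");
        }
        StringBuilder sb = new StringBuilder(host);
        if (index != null) {
            sb.append('/').append(index);
        }
        if (type != null) {
            sb.append('/').append(type);
        }
        return sb.append("/_solr").toString();
    }

    public static void main(String[] args) {
        String host = "http://localhost:9200";
        System.out.println(url(host, null, null));             // http://localhost:9200/_solr
        System.out.println(url(host, "testindex", null));      // http://localhost:9200/testindex/_solr
        System.out.println(url(host, "testindex", "testtype")); // http://localhost:9200/testindex/testtype/_solr
    }
}
```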

Use the client/tool as you would with Solr.

Example SolrJ Indexing

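    // imports needed by this snippet (SolrJ 3.x)
    import java.util.ArrayList;
    import java.util.Collection;
    import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:9200/testindex/testtype/_solr");
    server.setRequestWriter(new BinaryRequestWriter());
    // we support both XML and JavaBin response formats; uncomment to use XML
    // (requires org.apache.solr.client.solrj.impl.XMLResponseParser)
    //server.setParser(new XMLResponseParser());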
    
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField( "id", "id1", 1.0f );
    doc1.addField( "name", "doc1", 1.0f );
    doc1.addField( "price", 10 );

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField( "id", "id2", 1.0f );
    doc2.addField( "name", "doc2", 1.0f );
    doc2.addField( "price", 20 );
    
    Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
    docs.add( doc1 );
    docs.add( doc2 );
    
    server.add( docs );
    server.commit();

    // deletes work as well
    //server.deleteById("id2");
    //server.commit();

Perform a search and verify the documents were indexed.

Example SolrJ Searching

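    // imports needed by this snippet (SolrJ 3.x)
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:9200/testindex/testtype/_solr");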

    String qstr = "id:[* TO *]";
    SolrQuery query = new SolrQuery();
    query.setQuery(qstr);

    QueryResponse response = server.query(query);
    for (SolrDocument doc : response.getResults()) {
        for (String field : doc.getFieldNames()) {
            System.out.println(field + " = " + doc.getFieldValue(field));
        }
        System.out.println();
    }

Example using Nutch

At a minimum, use the following type mapping for ElasticSearch.

curl -XPUT 'http://localhost:9200/testindex'
curl -XPUT 'http://localhost:9200/testindex/testtype/_mapping' -d '{
    "testtype" : {
        "properties" : {
            "id" : {
                "type" : "string",
                "store": "yes"
            },
            "digest" : {
                "type" : "string",
                "store" : "yes",
                "index" : "no"
            },
            "boost" : {
                "type" : "float",
                "store" : "yes",
                "index" : "no"
            },
            "tstamp" : {
                "type" : "date",
                "store" : "yes",
                "index" : "no"
            }
        }
    }
}'

Follow the Nutch tutorial at http://wiki.apache.org/nutch/NutchTutorial

  • Follow steps 1 through 3.1
  • For step 3.1 use:
bin/nutch crawl urls -solr http://localhost:9200/testindex/testtype/_solr -depth 3 -topN 5

Notes

ElasticSearch does not require a schema, and all the data you send to the mock Solr handlers will be indexed
by default. You can use the ElasticSearch PUT Mapping API to define your field types, what should be stored,
analyzed, etc. Data indexed via the mock XML Update Handler will most likely be detected by ElasticSearch as
strings, so it is a good idea to mimic your Solr schema with an ElasticSearch type mapping.
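
As a small illustration of mimicking a schema field, the sketch below builds a PUT Mapping body that declares one field with an explicit type, for example the price field from the SolrJ indexing example as an integer. The class and method names are hypothetical. It assumes the body would be sent with an HTTP PUT to the type's _mapping URL, as in the curl example in the Nutch section, and that the names contain no characters that need JSON escaping.

```java
/** Hypothetical helper, not part of the plugin: builds a one-field type mapping. */
final class MappingSketch {
    private MappingSketch() {}

    /** Returns a PUT Mapping body declaring a single field of the given ES type. */
    static String mappingJson(String type, String field, String fieldType) {
        // Assumption: the arguments are plain identifiers with no quotes or backslashes.
        return "{\"" + type + "\":{\"properties\":{\"" + field
                + "\":{\"type\":\"" + fieldType + "\"}}}}";
    }

    public static void main(String[] args) {
        // Would be PUT to http://localhost:9200/testindex/testtype/_mapping
        System.out.println(mappingJson("testtype", "price", "integer"));
        // prints {"testtype":{"properties":{"price":{"type":"integer"}}}}
    }
}
```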
