elasticsearch-xml's Introduction

Elasticsearch XML Plugin

The XML plugin for Elasticsearch is a simple REST filter for sending and receiving XML.

It converts REST HTTP bodies between JSON and XML, which may be useful for embedding Elasticsearch in XML environments.

For sending XML, you must add the HTTP header Content-type: application/xml

For receiving XML, you must add the HTTP header Accept: application/xml
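The two headers above can be sketched in Python as follows. This is a minimal illustration, not part of the plugin: the helper name xml_request and the base URL http://localhost:9200 are assumptions, and the request is only built, not sent.

```python
import urllib.request

# Assumption: a node with the plugin installed listens on localhost:9200.
BASE_URL = "http://localhost:9200"


def xml_request(path, xml_body=None):
    """Build (but do not send) a request that uses XML in both directions."""
    # Accept asks the node to answer in XML.
    headers = {"Accept": "application/xml"}
    data = None
    if xml_body is not None:
        # Content-type declares that the request body is XML.
        headers["Content-type"] = "application/xml"
        data = xml_body.encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


req = xml_request("/a/c/1", "<root><name>value</name></root>")
print(req.get_method(), req.full_url, req.header_items())
```

Passing the request to urllib.request.urlopen would send it to a running node.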

Each JSON name is converted to a valid XML element name according to ISO 9075.
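As a rough illustration of ISO 9075 style name escaping, the sketch below replaces characters that are not allowed in an XML name with the _xHHHH_ form. It is a simplified approximation and not the plugin's code: the function name encode_iso9075 is hypothetical, the character classes are approximated with Python's str.isalpha, and namespace prefixes and leading underscores (which the plugin handles in its own way, as the examples further down show) are not considered.

```python
def encode_iso9075(name: str) -> str:
    """Escape characters that are invalid in an XML name as _xHHHH_ (simplified)."""

    def is_start(ch: str) -> bool:
        # Approximation of XML NameStartChar (without the colon).
        return ch == "_" or ch.isalpha()

    def is_name(ch: str) -> bool:
        # Approximation of XML NameChar.
        return is_start(ch) or ch in "0123456789-."

    out = []
    for i, ch in enumerate(name):
        if ch == "_" and name[i + 1:i + 2] == "x":
            # Escape an underscore that would otherwise look like an escape sequence.
            out.append("_x005F_")
        elif (i == 0 and not is_start(ch)) or (i > 0 and not is_name(ch)):
            out.append("_x%04X_" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


for sample in ("name", "first name", "1st", "_xyz"):
    print(sample, "->", encode_iso9075(sample))
```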

Because XML is more restrictive than JSON, do not assume that XML can serve as a full replacement for JSON in Elasticsearch.

The JSON to XML conversion uses some tricks, so do not be surprised by edge cases where the XML gives peculiar results.

Versions

Elasticsearch version    Plugin     Release date
2.3.5                    2.3.5.1    Aug 24, 2016
2.3.5                    2.3.5.0    Aug 13, 2016
1.6.0                    1.6.0.2    Jul 3, 2015
1.4.2                    1.4.2.0    Feb 2, 2015
1.3.2                    1.3.0.0    Aug 19, 2014
1.2.2                    1.2.2.1    Jul 22, 2014

Installation

Elasticsearch 2.x

./bin/plugin install 'http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-xml/2.3.5.1/elasticsearch-xml-2.3.5.1-plugin.zip'

Elasticsearch 1.x

./bin/plugin --install xml --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-xml/1.6.0.2/elasticsearch-xml-1.6.0.2-plugin.zip

Do not forget to restart the node after installing.

Project docs

The Maven project site is available at GitHub.

Examples

Consider the following JSON documents.

Command:

curl '0:9200/_search?pretty'

Output:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 7,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "a",
      "_type" : "b",
      "_id" : "3",
      "_score" : 1.0, "_source" : {"name":"Jörg"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "2",
      "_score" : 1.0, "_source" : {"es:foo":"bar"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "7",
      "_score" : 1.0, "_source" : {"":"Hello World"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "c",
      "_score" : 1.0, "_source" : {"Hello":"World"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "6",
      "_score" : 1.0, "_source" : {"@context":{"p":"http://another.org"},"p:foo":"bar"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "4",
      "_score" : 1.0, "_source" : {"@context":{"p":"http://example.org"},"p:foo":"bar"}
    }, {
      "_index" : "a",
      "_type" : "b",
      "_id" : "5",
      "_score" : 1.0, "_source" : {"@context":{"p":"http://dummy.org"},"p:foo":"bar"}
    } ]
  }
}

The same in XML.

Command:

curl -H 'Accept: application/xml'  '0:9200/_search?pretty'

Output:

<root xmlns="http://elasticsearch.org/ns/1.0/" xmlns:p="http://dummy.org">
  <took>3</took>
  <timed_out>false</timed_out>
  <shards>
    <total>5</total>
    <successful>5</successful>
    <failed>0</failed>
  </shards>
  <hits>
    <total>7</total>
    <max_score>1.0</max_score>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>3</id>
      <score>1.0</score>
      <source>
        <name>Jörg</name>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>2</id>
      <score>1.0</score>
      <source>
        <foo>bar</foo>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>7</id>
      <score>1.0</score>
      <source>
        <>Hello World</>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>c</id>
      <score>1.0</score>
      <source>
        <Hello>World</Hello>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>6</id>
      <score>1.0</score>
      <source>
        <context es:p="http://another.org"/>
        <wstxns1:foo xmlns:wstxns1="http://another.org">bar</wstxns1:foo>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>4</id>
      <score>1.0</score>
      <source>
        <context es:p="http://example.org"/>
        <wstxns2:foo xmlns:wstxns2="http://example.org">bar</wstxns2:foo>
      </source>
    </hits>
    <hits>
      <index>a</index>
      <type>b</type>
      <id>5</id>
      <score>1.0</score>
      <source>
        <context es:p="http://dummy.org"/>
        <p:foo>bar</p:foo>
      </source>
    </hits>
  </hits>
</root>

As shown above, the @context name in JSON lets you declare XML namespaces.

The @context is similar to JSON-LD's @context but not as powerful.
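The idea of resolving prefixed names through @context can be sketched as below. This only illustrates the lookup; it is not how the plugin is implemented, and the function name resolve_names is hypothetical.

```python
def resolve_names(source):
    """Map prefixed JSON names to (namespace URI, local name) pairs via @context."""
    context = source.get("@context", {})
    resolved = {}
    for key, value in source.items():
        if key == "@context":
            continue  # the declaration itself is not a data field
        prefix, sep, local = key.partition(":")
        if sep and prefix in context:
            # Known prefix: attach the declared namespace URI.
            resolved[(context[prefix], local)] = value
        else:
            # No prefix, or an undeclared one: keep the name without a namespace.
            resolved[(None, key)] = value
    return resolved


doc = {"@context": {"p": "http://example.org"}, "p:foo": "bar"}
print(resolve_names(doc))
```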

XML Attributes

JSON names that start with @ appear as XML attributes.

XML attributes in documents you send appear as normal JSON names.

If nested XML does not lead to a proper JSON object, an empty JSON name is used, which might not be useful.

Command:

curl -XPOST -H 'Content-type: application/xml' '0:9200/a/c/1' -d '<root><name attr="test">value</name></root>'
curl '0:9200/a/c/1?pretty'

Output:

{
  "_index" : "a",
  "_type" : "c",
  "_id" : "1",
  "_version" : 1,
  "found" : true, "_source" : {"name":{"attr":"test","":"value"}}
}
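The shape of the _source above can be reproduced with a small sketch: attributes become ordinary names, and text next to attributes lands under the empty name. This is an illustration of the mapping using Python's standard library, not the plugin's implementation; the function name element_to_json is hypothetical, and repeated sibling elements simply overwrite each other in this sketch.

```python
import json
import xml.etree.ElementTree as ET


def element_to_json(elem):
    """Convert an element to a JSON-like value (simplified)."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text  # plain leaf element: just its text
    obj = dict(elem.attrib)  # attributes become normal names
    for child in children:
        obj[child.tag] = element_to_json(child)
    if text:
        obj[""] = text  # text beside attributes or children gets the empty name
    return obj


root = ET.fromstring('<root><name attr="test">value</name></root>')
source = element_to_json(root)
print(json.dumps(source))
```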

Another example.

Command:

curl -XPOST '0:9200/a/c/2' -d '{"test":{"@attr": "value"}}'
curl -H 'Accept: application/xml' '0:9200/a/c/2?pretty'

Output:

<root xmlns="http://elasticsearch.org/ns/1.0/" xmlns:es="http://elasticsearch.org/ns/1.0/">
  <index>a</index>
  <type>c</type>
  <id>2</id>
  <version>1</version>
  <found>true</found>
  <source>
    <test es:attr="value"/>
  </source>
</root>
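The opposite direction, where names starting with @ turn into attributes, might look like the sketch below. It is an assumption-laden illustration: the function name build is hypothetical, and unlike the plugin output above it does not place attributes in the es namespace.

```python
import xml.etree.ElementTree as ET


def build(parent, obj):
    """Add the entries of a JSON-like dict to an XML element (simplified)."""
    for key, value in obj.items():
        if key.startswith("@"):
            # Names starting with @ become attributes of the enclosing element.
            parent.set(key[1:], str(value))
        elif isinstance(value, dict):
            build(ET.SubElement(parent, key), value)
        else:
            ET.SubElement(parent, key).text = str(value)


source = ET.Element("source")
build(source, {"test": {"@attr": "value"}})
print(ET.tostring(source, encoding="unicode"))
```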

License

Elasticsearch XML Plugin

Copyright (C) 2014 Jörg Prante

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

elasticsearch-xml's People

Contributors

jprante


elasticsearch-xml's Issues

Error with ver 1.6.0

Hi,

When I use the query "_search?size=10&from=0&pretty&xml" or curl -H 'Accept: application/xml' '_search?pretty'
the following error message appears:
Could not initialize class org.xbib.elasticsearch.common.xcontent.xml.XmlXContent

Update ES version to 1.2.x

Could you release a version for ES 1.2.x, or explain what I have to change to make the plugin work with ES 1.2.1?

Bulk Adding

Is it possible to bulk add all the immediate children of the root node as separate documents from a large data file?

Can't retrieve more than 10 results with the plugin

Hi,
It seems that there is a problem when querying with a "size" parameter > 10~15 (try with 100 for instance). I keep getting "array not available" as an error message. Can you check on your side that this is not a bug in your plugin?
I use Elasticsearch v1.2.1 and your latest build.
Besides this, you have done an amazing job for those who unfortunately cannot use JSON (my use case is to query ES from Excel VBA).

Error with version 1.3.0.0

The error mentioned in #3 appears again in the log after the upgrade, but the data is retrieved correctly.

[2014-08-20 10:15:26,113][ERROR][org.xbib.elasticsearch.rest.xml.XmlFilter] com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '{' (code 123) in prolog; expected '<'
at [row,col {unknown-source}]: [1,1]
java.io.IOException: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '{' (code 123) in prolog; expected '<'
at [row,col {unknown-source}]: [1,1]
at com.fasterxml.jackson.dataformat.xml.util.StaxUtil.throwXmlAsIOException(StaxUtil.java:24)
at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:583)
at com.fasterxml.jackson.dataformat.xml.XmlFactory._createParser(XmlFactory.java:28)
at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:812)
at org.xbib.elasticsearch.common.xcontent.xml.XmlXContent.createParser(XmlXContent.java:109)
at org.xbib.elasticsearch.common.xcontent.xml.XmlXContent.createParser(XmlXContent.java:115)
at org.xbib.elasticsearch.rest.xml.XmlFilter$XmlRequest.content(XmlFilter.java:95)
at org.elasticsearch.rest.action.search.RestSearchAction.parseSearchRequest(RestSearchAction.java:87)
at org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:72)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:66)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:177)
at org.elasticsearch.rest.RestController$RestHandlerFilter.process(RestController.java:252)
at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:233)
at org.xbib.elasticsearch.rest.xml.XmlFilter.process(XmlFilter.java:44)
at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:236)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:170)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:294)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:44)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

Plugin for new releases of Elasticsearch

Hi

First, I appreciate the great work you have done on the ES plugins.

I am running ES 2.3.5. How can I run the latest release of elasticsearch-xml plugin on it?

If it's not supported, I have no option but to downgrade my ES to 1.6 :-(

MapperParsingException

When I run the example command:

curl -XPOST -H 'Content-type: application/xml' '0:9200/a/c/1' -d '<root><name attr="test">value</name></root>'

I get this:

{"error":"MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive xcontent from (offset=0, length=43): [60, 114, 111, 111, 116, 62, 60, 110, 97, 109, 101, 32, 97, 116, 116, 114, 61, 34, 116, 101, 115, 116, 34, 62, 118, 97, 108, 117, 101, 60, 47, 110, 97, 109, 101, 62, 60, 47, 114, 111, 111, 116, 62]]; ","status":400}

ES 1.3.x

Sorry to bother you again, could you release a version for ES 1.3.x ?

Arrays not importing

I have a document that contains the following structure:

<AuthorList CompleteYN="Y">
    <Author ValidYN="Y">
        <LastName>Smith</LastName>
        <ForeName>T M</ForeName>
        <Initials>TM</Initials>
        <Affiliation>Some place</Affiliation>
    </Author>
    <Author ValidYN="Y">
        <LastName>Johnson</LastName>
        <ForeName>K P</ForeName>
        <Initials>KP</Initials>
    </Author>
</AuthorList>

When I send this as part of an XML document for processing, I only get the first entry instead of the entire array.

Difficulties in Running Sample Code in README.md

Hi,

I am new to Elasticsearch and am looking to load a large number of XML files. I was excited to find your plugin but am experiencing difficulties. After successfully installing your plugin and restarting ES, I attempted to execute the sample code in README.md and got the following errors. Perhaps I am doing something incorrectly.

Code starting: curl '0:9200/_search?pretty'
Error: > "_score" : 1.0, "_source" : {"@context":{"p":"http://dummy.org"},"p:foo":"bar"}

} ]

-bash: syntax error near unexpected token `]'

Code starting: curl -H 'Accept: application/xml' '0:9200/_search?pretty'
Error: There appears to be no quoting of the xml and when I attempt this I get parse exception errors.

As I am new to this I assume that there is something simple I am doing incorrectly and I was hoping that you could guide me.

Kind regards,
Alex

XML Plugin not working when creating document with auto-id

Hi,

It looks like the XML plugin is not called when a document is created, but works perfectly if the document already exists.

ex:

curl -XPOST -H 'Content-type: application/xml' -H 'Accept: application/xml' '0:9200/c/c/new' -d '<root><name>value</name></root>'

return

{"_index":"c","_type":"c","_id":"new","_version":1,"created":true}

Then

curl -XPOST -H 'Content-type: application/xml' -H 'Accept: application/xml' '0:9200/c/c/new' -d '<root><name>value</name></root>'

return

<root xmlns="http://elasticsearch.org/ns/1.0/" xmlns:es="http://elasticsearch.org/ns/1.0/"><index>c</index><type>c</type><id>new</id><version>2</version><created>false</created></root>

Did I miss something?

regards.

Cyril

Using ES

<number>1.0.0</number><build_hash>a46900e9c72c0a623d71b54016357d5f94c8ea32</build_hash><build_timestamp>2014-02-12T16:18:34Z</build_timestamp><build_snapshot>false</build_snapshot><lucene_version>4.6</lucene_version>
