
elasticsearch-knapsack's Introduction

Knapsack

Image by Rick McCharles CC BY 2.0 https://creativecommons.org/licenses/by/2.0/

Knapsack plugin for Elasticsearch

Knapsack is a "Swiss army knife" export/import plugin for Elasticsearch. It supports archive formats (tar, zip, cpio) and the Elasticsearch bulk format, combined with compression algorithms (gzip, bzip2, lzf, xz).

Pushing or pulling indices, or search hits with stored fields, across clusters is also supported.

Knapsack actions can be executed via the HTTP REST API or programmatically via the Java API.

In archive files, the following index information is encoded:

  • index settings
  • index mappings
  • index aliases

When importing archive files again, this information is reapplied.

Compatibility matrix


Elasticsearch Plugin Release date
2.3.4 2.3.4.0 Aug 4, 2016
2.3.3 2.3.3.0 May 23, 2016
2.3.1 2.3.1.0 Apr 21, 2016
2.3.0 2.3.0.0 Mar 31, 2016
2.2.1 2.2.1.0 Mar 31, 2016
2.1.2 2.1.2.0 Mar 23, 2016
2.2.0 2.2.0.0 Feb 23, 2016
2.1.1 2.1.1.0 Dec 30, 2015
2.1.0 2.1.0.0 Dec 7, 2015
2.0.0 2.0.0.0 Nov 14, 2015
2.0.0-rc1 2.0.0-rc1.0 Oct 12, 2015

For older releases and 1.x versions, see the respective branches.

Installation 2.x

./bin/plugin install http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-knapsack/2.3.4.0/elasticsearch-knapsack-2.3.4.0-plugin.zip

Do not forget to restart the node after installation.
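When upgrading the plugin, remove the old version first. Assuming the plugin is registered under the name knapsack (the name is an assumption here), this could look like:

./bin/plugin remove knapsack
./bin/plugin install http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-knapsack/2.3.4.0/elasticsearch-knapsack-2.3.4.0-plugin.zip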

Note: If you get an error while exporting or importing like this

{"error":{"root_cause":[{"type":"access_control_exception","reason":"access denied (\"java.io.FilePermission\" \"/foo/bar.zip\" \"read\")"}],"type":"access_control_exception","reason":"access denied (\"java.io.FilePermission\" \"/foo/bar.zip\" \"read\")"},"status":500}

then you are blocked by the Elasticsearch 2.x security manager. In this case, choose another directory for reading/writing archive files, preferably a directory under path.logs.

It is recommended to add a dedicated node with only the Knapsack plugin installed (no data, no master role) and to remove that node after the export/import has completed.
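A possible way to start such a dedicated node, assuming Elasticsearch 2.x command-line settings (the node name is illustrative):

./bin/elasticsearch -Des.node.master=false -Des.node.data=false -Des.node.name=knapsack-node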

Project docs

The Maven project site is available on GitHub.

Overview

Diagram

Example

Let's go through a simple example:

curl -XDELETE localhost:9200/test
curl -XPUT localhost:9200/test/test/1 -d '{"key":"value 1"}'
curl -XPUT localhost:9200/test/test/2 -d '{"key":"value 2"}'

Export

You can export this Elasticsearch index with

curl -XPOST localhost:9200/test/test/_export
{"running":true,"state":{"mode":"export","started":"2015-10-12T18:13:47.214Z","path":"file:///Users/es/elasticsearch-2.0.0-rc1/logs/_all.tar.gz","node_name":"Doctor Bong"}}

The result is a file in the Elasticsearch path.logs folder

-rw-r--r--   1 joerg  staff          343 28 Sep 21:18 test_test.tar.gz

Checking with the tar utility shows that the settings and the mapping are also exported:

tar ztvf test_test.tar.gz 
-rw-r--r--  0 joerg  0         133 28 Sep 21:18 test/_settings/null/null
-rw-r--r--  0 joerg  0          49 28 Sep 21:18 test/test/_mapping/null
-rw-r--r--  0 joerg  0          17 28 Sep 21:18 test/test/1/_source
-rw-r--r--  0 joerg  0          17 28 Sep 21:18 test/test/2/_source

Also, you can export a whole index with

curl -XPOST localhost:9200/test/_export

which writes the archive file test.tar.gz. You can even export all cluster indices with

curl -XPOST 'localhost:9200/_export'

which writes the file _all.tar.gz.

Available suffixes for archive formats

.tar
.zip
.cpio
.bulk

Available suffixes for compression

.gz
.bzip2
.xz
.lzf

By default, the archive format is tar with compression gz (gzip).

You can also export to zip, cpio or bulk archive format.

Available compression codecs are bz2 (bzip2), xz (XZ), or lzf (LZF).

Note: the .bulk suffix writes files in the Elasticsearch bulk format.
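For example, combining the suffixes, an export to a gzip-compressed bulk archive could look like this (it uses the archivepath parameter described below; the path is illustrative):

curl -XPOST 'localhost:9200/test/_export?archivepath=/tmp/test.bulk.gz'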

Export search results

You can add a query to the _export endpoint just like you would do for searching in Elasticsearch.

curl -XPOST 'localhost:9200/test/test/_export' -d '{
   "query" : {
       "match_phrase" : {
           "key" : "value 1"
       }
   },
   "fields" : [ "_parent", "_source" ]
}'

Export to an archive with a given archive path name

You can configure an archive path with the parameter archivepath

curl -XPOST 'localhost:9200/test/_export?archivepath=/tmp/myarchive.zip'

If Elasticsearch cannot write to the archive path, an error message will appear and no export will take place.

Note: Elasticsearch 2.x has a security manager enabled by default which prevents reading/writing to locations outside of Elasticsearch directories. Therefore, the default location for export/import is set to the path.logs directory. If you prefer to write to or read from arbitrary locations, you can disable the security manager with

./bin/elasticsearch ... -Dsecurity.manager.enabled=false

Existing archive files are not overwritten. You can force overwriting with the parameter overwrite=true.
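For example (the archive path is illustrative):

curl -XPOST 'localhost:9200/test/_export?archivepath=/tmp/myarchive.zip&overwrite=true'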

Export split by byte size

You can create multiple archive files with the parameter bytes

curl -XPOST 'localhost:9200/test/_export?archivepath=/tmp/myindex.bulk&bytes=10m'

This creates myindex.bulk, 1.myindex.bulk, 2.myindex.bulk ... where all archive files are around 10 megabytes.

Renaming indexes and index types

You can rename indexes and index types by adding a map parameter that contains a JSON object with old and new index (and index/type) names.

curl -XPOST 'localhost:9200/test/type/_export?map=\{"test":"testcopy","test/type":"testcopy/typecopy"\}'

Note the backslash, which is required to escape shell interpretation of curly braces.

Push or pull indices from one cluster to another

If you want to push or pull indices from one cluster to another, Knapsack is your friend.

You can copy an index in the local cluster or to a remote cluster with the _push or the _pull endpoint. This works if you have the same Java JVM version and the same Elasticsearch version.

Example for a local cluster copy of the index test to testcopy

curl -XPOST 'localhost:9200/test/_push?map=\{"test":"testcopy"\}'

Example for a remote cluster copy of the index test by using the parameters cluster, host, and port

curl -XPOST 'localhost:9200/test/_push?cluster=remote&host=127.0.0.1&port=9201'
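The _pull endpoint works the other way around. Assuming it accepts the same cluster, host, and port parameters (an assumption, not verified here), pulling the index test from a remote cluster might look like:

curl -XPOST 'localhost:9200/test/_pull?cluster=remote&host=127.0.0.1&port=9201'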

This is a complete example that illustrates how to filter an index by timestamp and copy this part to another index

curl -XDELETE 'localhost:9200/test'
curl -XDELETE 'localhost:9200/testcopy'
curl -XPUT 'localhost:9200/test/' -d '
{
    "mappings" : {
        "_default_": {
            "_timestamp" : { "enabled" : true, "store" : true, "path" : "date" }
        }
    }
}
'
curl -XPUT 'localhost:9200/test/doc/1' -d '
{
    "date" : "2014-01-01T00:00:00",
    "sentence" : "Hi!",
    "value" : 1
}
'
curl -XPUT 'localhost:9200/test/doc/2' -d '
{
    "date" : "2014-01-02T00:00:00",
    "sentence" : "Hello World!",
    "value" : 2
}
'
curl -XPUT 'localhost:9200/test/doc/3' -d '
{
    "date" : "2014-01-03T00:00:00",
    "sentence" : "Welcome!",
    "value" : 3
}
'
curl 'localhost:9200/test/_refresh'
curl -XPOST 'localhost:9200/test/_push?map=\{"test":"testcopy"\}' -d '
{
    "fields" : [ "_timestamp", "_source" ],
    "query" : {
         "filtered" : {
             "query" : {
                 "match_all" : {
                 }
             },
             "filter" : {
                "range": {
                   "_timestamp" : {
                       "from" : "2014-01-02"
                   }
                }
             }
         }
     }
}
'
curl '0:9200/test/_search?fields=_timestamp&pretty'
curl '0:9200/testcopy/_search?fields=_timestamp&pretty'

Import

You can import the file with the _import endpoint

curl -XPOST 'localhost:9200/test/test/_import'

Knapsack does not delete or overwrite data by default. You can use the parameter createIndex=false to allow importing into indices that already exist.
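For example, to import into an already existing index test:

curl -XPOST 'localhost:9200/test/test/_import?createIndex=false'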

When importing, you can map your indexes or index/types to your favorite ones.

curl -XPOST 'localhost:9200/test/_import?map=\{"test":"testcopy"\}'

Modifying settings and mappings

You can overwrite the settings and mapping when importing by using parameters in the form <index>_settings=<filename> or <index>_<type>_mapping=<filename>.

General example:

curl -XPOST 'localhost:9200/myindex/mytype/_import?myindex_settings=/my/new/mysettings.json&myindex_mytype_mapping=/my/new/mapping.json'

The following statements demonstrate how you can change the number of shards from the default 5 to 1 and the number of replicas from 1 to 0 for an index test:

curl -XDELETE localhost:9200/test
curl -XPUT 'localhost:9200/test/test/1' -d '{"key":"value 1"}'
curl -XPUT 'localhost:9200/test/test/2' -d '{"key":"value 2"}'
curl -XPUT 'localhost:9200/test2/foo/1' -d '{"key":"value 1"}'
curl -XPUT 'localhost:9200/test2/bar/1' -d '{"key":"value 1"}'
curl -XPOST 'localhost:9200/test/_export'
tar zxvf test.tar.gz test/_settings
echo '{"index.number_of_shards":"1","index.number_of_replicas":"0"}' > test/_settings/null/null
curl -XDELETE 'localhost:9200/test'
curl -XPOST 'localhost:9200/test/_import?test_settings=test/_settings/null/null'
curl -XGET 'localhost:9200/test/_settings?pretty'
curl -XPOST 'localhost:9200/test/_search?q=*&pretty'

The result is a search on an index with just one shard.

{
  "took" : 19,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test",
      "_type" : "test",
      "_id" : "1",
      "_score" : 1.0,
      "_source":{"key":"value 1"}
    }, {
      "_index" : "test",
      "_type" : "test",
      "_id" : "2",
      "_score" : 1.0,
      "_source":{"key":"value 2"}
    } ]
  }
}

State of knapsack import/export actions

While exports or imports are running, you can check the state with

curl -XPOST 'localhost:9200/_export/state'

or

curl -XPOST 'localhost:9200/_import/state'

Aborting knapsack actions

If you want to abort all running Knapsack exports or imports, you can do so with

curl -XPOST 'localhost:9200/_export/abort'

or

curl -XPOST 'localhost:9200/_import/abort'

Handling Parent/Child documents

Exporting

Handling dependent documents is a bit tricky, since indexing a child document requires the presence of its parent document. A simple approach is to export the documents into separate archives by using a query. If your child documents are located in the same type as the parent documents, define an appropriate filter in the query. If you have stored the child documents in a separate type, you can export the type containing the parent documents like this:

curl -XPOST 'localhost:9200/myIndex/myParentDocs/_export?archivepath=/tmp/myIndex_myParentDocs.zip'

When exporting the type containing the child documents, include the "_parent" meta field

curl -XPOST 'localhost:9200/myIndex/myChildDocs/_export?archivepath=/tmp/myIndex_myChildDocs.zip' -d '{
   "query" : {
       "match_all" : {
       }
   },
   "fields" : [ "_parent", "_source" ]
}'

Importing Parent/Child documents

Before you import the parent documents, you have to create the index manually first: each type export only contains the mapping of that specific type, and you cannot add a dependent mapping in a second step later. All dependent mappings must be created at the same time, otherwise you will get an error like "java.lang.IllegalArgumentException: Can't specify parent if no parent field has been configured". After creating the index, import the parent documents:

curl -XPOST 'localhost:9200/myIndex/myParentDocs/_import?archivepath=/tmp/myIndex_myParentDocs.zip&createIndex=false'

Then import the child documents:

curl -XPOST 'localhost:9200/myIndex/myChildDocs/_import?archivepath=/tmp/myIndex_myChildDocs.zip&createIndex=false'

Repeat this for all your child types.
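For reference, a sketch of the manual index creation described above, using the illustrative index and type names from these examples and the standard _parent mapping syntax:

curl -XPUT 'localhost:9200/myIndex' -d '
{
    "mappings" : {
        "myParentDocs" : { },
        "myChildDocs" : {
            "_parent" : { "type" : "myParentDocs" }
        }
    }
}
'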

Java API

Knapsack implements all actions as Java transport actions in Elasticsearch.

You can consult the JUnit tests to find out how to use the API. To give you an impression, here is an example of a very minimal export/import cycle using the bulk archive format.

    // index a single test document and refresh so it is visible to the export
    client.index(new IndexRequest().index("index1").type("test1").id("doc1")
         .source("content","Hello World").refresh(true)).actionGet();

    // export to a temporary file in bulk format (the .bulk suffix selects the format)
    File exportFile = File.createTempFile("minimal-import-", ".bulk");
    Path exportPath = Paths.get(URI.create("file:" + exportFile.getAbsolutePath()));
    KnapsackExportRequestBuilder requestBuilder = new KnapsackExportRequestBuilder(client.admin().indices())
            .setArchivePath(exportPath)
            .setOverwriteAllowed(false);
    KnapsackExportResponse knapsackExportResponse = requestBuilder.execute().actionGet();

    // the export runs asynchronously; the state action reports running exports/imports
    KnapsackStateRequestBuilder knapsackStateRequestBuilder =
           new KnapsackStateRequestBuilder(client.admin().indices());
    KnapsackStateResponse knapsackStateResponse = knapsackStateRequestBuilder.execute().actionGet();

    // give the asynchronous export some time to complete
    Thread.sleep(1000L);

    // drop the index so the import starts from a clean slate
    client.admin().indices().delete(new DeleteIndexRequest("index1")).actionGet();

    // import the archive back from the same path
    KnapsackImportRequestBuilder knapsackImportRequestBuilder = new KnapsackImportRequestBuilder(client.admin().indices())
            .setArchivePath(exportPath);
    KnapsackImportResponse knapsackImportResponse = knapsackImportRequestBuilder.execute().actionGet();

Caution

Knapsack is very simple and works without locks or snapshots. This means that if Elasticsearch is allowed to write to the data you are exporting while the export runs, you may lose data in the export. It is up to you to organize a safe export and import with this plugin.
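One possible precaution (plain Elasticsearch index settings, not a Knapsack feature) is to block writes on the index for the duration of the export, for example:

curl -XPUT 'localhost:9200/test/_settings' -d '{"index.blocks.write": true}'
curl -XPOST 'localhost:9200/test/_export'
curl -XPUT 'localhost:9200/test/_settings' -d '{"index.blocks.write": false}'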

If you want a more robust solution, please use snapshot/restore, which is the standard procedure for saving and restoring data in Elasticsearch:

http://www.elasticsearch.org/blog/introducing-snapshot-restore/

Credits

Knapsack contains derived work of Apache Commons Compress http://commons.apache.org/proper/commons-compress/

The code in this component has many origins: The bzip2, tar and zip support came from Avalon's Excalibur, but originally from Ant, as far as life in Apache goes. The tar package is originally Tim Endres' public domain package. The bzip2 package is based on the work done by Keiron Liddle as well as Julian Seward's libbzip2. It has migrated via: Ant -> Avalon-Excalibur -> Commons-IO -> Commons-Compress. The cpio package is based on contributions of Michael Kuss and the jRPM project.

Thanks to nicktgr15 <https://github.com/nicktgr15> for extending Knapsack to support Amazon S3.

License

Knapsack Plugin for Elasticsearch

Copyright (C) 2012 Jörg Prante

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

elasticsearch-knapsack's People

Contributors

dtan4, gabriel-tessier, hellovic, jprante, marbleman, nicktgr15, nithril, penguinco, timwaters, vvanholl


elasticsearch-knapsack's Issues

"error":"FileNotFoundException when exporting

Hi guys, we are trying to use knapsack for our backups (60GB+) but we always have this error:

curl -XPOST localhost:9200/_export
{"error":"FileNotFoundException[_all.tar.gz (Permission denied)]","status":500}

We are on the machine via ssh and have tried it as root, but no luck. Any idea?

Thanks

Cannot change the number of shards using knapsack

I am attempting to follow the instructions for changing the number of shards on an index. Here is a link to the tutorial:

https://github.com/jprante/elasticsearch-knapsack

curl -XDELETE localhost:9200/test
curl -XPUT 'localhost:9200/test/test/1' -d '{"key":"value 1"}'
curl -XPUT 'localhost:9200/test/test/2' -d '{"key":"value 2"}'
curl -XPUT 'localhost:9200/test/foo/1' -d '{"key":"value 1"}'
curl -XPUT 'localhost:9200/test/bar/1' -d '{"key":"value 1"}'
curl -XPOST 'localhost:9200/test/_export?path=/home/thalej/ESData/test.tar.gz'
tar zxvf test.tar.gz test/_settings
echo '{"index.number_of_shards":"1","index.number_of_replicas":"0","index.version.created":"200199"}' > test/_settings
curl -XDELETE 'localhost:9200/test'
curl -XPOST 'localhost:9200/test/_import?test_settings=test/_settings&path=/home/thalej/ESData/test.tar.gz'
curl -XGET 'localhost:9200/test/_settings?pretty'
curl -XPOST 'localhost:9200/test/_search?q=*&pretty'

The only difference between the tutorial and what I posted is that I supply the path to the test.tar.gz file.

When I unzip the file using: "tar zxvf test.tar.gz test/_settings", it creates a file called:
"test/_settings/null/null", with seems a bit odd, but ok I guess. But then when I run the next command:
echo '{"index.number_of_shards":"1","index.number_of_replicas":"0","index.version.created":"200199"}' > test/_settings

It fails because a directory was made (test/_settings) from un-compressing the tar file so the echo command cannot echo to a directory.

So my question is this: Has anyone had success exporting to a file path, changing the number of shards and then importing the export file? If so, your posted example would be much appreciated. Thanks

knapsack question

I installed knapsack on elasticsearch v0.90.1 successfully.

I tried to export an index named g4 per the README instructions. My command and return value is as follows:

curl -XPOST "http://localhost:9200/g4/file/_export"
{"error":"MapperParsingException[failed to parse, document is empty]","status":400}

As you can see the result was an error. Am I supposed to be using curl's -d option to pass in a JSON document, or should this have successfully exported?

Failed when exporting

Hi,

I use ES 0.20.1. I installed this plugin. While exporting (curl -XPOST localhost:9200/users/chat/_export) I get this error:
[2012-12-10 14:41:15,427][INFO ][rest.action ] [Cameron Hodge] starting export to users_chat
Exception in thread "[Exporter Thread users_chat]" java.lang.NoClassDefFoundError: Could not initialize class org.xbib.io.StreamCodecService
at org.elasticsearch.plugin.knapsack.io.tar.TarSession.<init>(TarSession.java:38)
at org.elasticsearch.plugin.knapsack.io.tar.TarConnection.createSession(TarConnection.java:50)
at org.elasticsearch.plugin.knapsack.io.tar.TarConnection.createSession(TarConnection.java:32)
at org.elasticsearch.rest.action.RestExportAction$1.run(RestExportAction.java:119)

And nothing more happens.

Index import issue

On box 1 running elasticsearch 0.90.9 (plugin 0.90.9.1) I'm exporting the index logstash-2014.03.05:

curl -XPOST 'localhost:9200/logstash-2014.03.05/_export'

On box 2 running elasticsearch 0.90.12 (plugin 0.90.11.2) I'm importing the index:

curl -XPOST 'localhost:9200/logstash-2014.03.05/_import'

The index seems to have imported properly on box 2:

curl localhost:9200/_aliases?pretty

{
  "kibana-int" : {
    "aliases" : { }
  },
  "logstash-2014.03.05" : {
    "aliases" : { }
  }
}

But nothing is returned when doing _search on box 2:

curl localhost:9200/logstash-2014.03.05/_search?pretty

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

What is missing?

Thanks,
Adrien

Support for 1.3.0

I've installed the latest knapsack under ES 1.3.0 but it doesn't seem to work; the cluster won't come up after restarting it. curl localhost:9200 gives curl: (7) couldn't connect to host. After removing knapsack it comes up again.

Anybody else got knapsack working in 1.3.0?

[2014-07-31 08:32:38,184][INFO ][plugins ] [... master] loaded [knapsack-1.2.0.0-ac08ffe, support-1.2.0.0-af78fcc], sites []
[2014-07-31 08:32:38,372][WARN ][plugins ] [... master] plugin support-1.2.0.0-af78fcc, failed to invoke custom onModule method

Permission Denied when exporting

I got the following error, when start exporting:

[2013-04-25 17:59:09,217][INFO ][rest.action              ] [boston] starting export to entities
[2013-04-25 17:59:09,218][ERROR][rest.action              ] [boston] entities.tar.gz (Permission denied)
java.io.FileNotFoundException: entities.tar.gz (Permission denied)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
        at org.xbib.io.tar.TarSession.createFileOutputStream(TarSession.java:185)
        at org.xbib.io.tar.TarSession.open(TarSession.java:112)
        at org.elasticsearch.rest.action.RestExportAction$1.run(RestExportAction.java:128)
        at java.lang.Thread.run(Thread.java:679)

To which directory does your plugin try to export the data?
I installed the application to /usr/local/elasticsearch:

$ ls -la
insgesamt 48
drwxr-xr-x  6 elasticsearch elasticsearch  4096 Apr 25 17:34 .
drwxr-xr-x 13 root          root           4096 Apr 25 16:02 ..
drwxr-xr-x  2 elasticsearch elasticsearch  4096 Apr 17 22:15 bin
drwxr-xr-x  2 elasticsearch elasticsearch  4096 Apr 18 15:54 config
drwxr-xr-x  3 elasticsearch elasticsearch  4096 Apr 17 22:15 lib
-rwxr-xr-x  1 elasticsearch elasticsearch 11358 Okt 14  2012 LICENSE.txt
-rwxr-xr-x  1 elasticsearch elasticsearch   165 Okt 14  2012 NOTICE.txt
drwxr-xr-x  3 elasticsearch elasticsearch  4096 Apr 25 17:35 plugins
-rwxr-xr-x  1 elasticsearch elasticsearch  7935 Okt 14  2012 README.textile

The data path is /usr/local/var/data/elasticsearch:

$ ls -la
insgesamt 12
drwxr-xr-x 3 elasticsearch elasticsearch 4096 Apr 17 22:16 .
drwxr-xr-x 3 root          root          4096 Apr 17 22:15 ..
drwxrwxr-x 3 elasticsearch elasticsearch 4096 Apr 17 22:16 elasticsearch

Thanks in advance

Invalid formatting in .bulk archive format

I am admittedly not very familiar with what is happening behind the scenes when you output to .bulk, but it appears that the output is incorrect, as it does not produce valid JSON.

Here is a sample of two logs dumped with knapsack and .bulk:

{"index":{"_index":"logstash-2014.10.01","_type":"traffic","_id":"soLaJFfsQKe1DwvSr0zs-g"}
{"message":"<188>date=2014-10-01 time=13:44:53 devname=fw01 devid=asdf logid=0000000011 type=traffic subtype=forward level=warning vd=root srcip=10.1.0.52 srcname=Comp srcport=4107 srcintf=\"internal1\" dstip=98.165.205.106 dstport=16437 dstintf=\"wan1\" sessionid=229855382 action=ip-conn policyid=1 crscore=1375731722 craction=262144","@version":"1","@timestamp":"2014-10-01T20:44:19.556Z","type":"traffic","tags":["fortigate"],"host":"10.1.0.1","<188>date":"2014-10-01","time":"13:44:53","devname":"fw01","devid":"asdf","logid":"0000000011","subtype":"forward","level":"warning","vd":"root","srcip":"10.1.0.52","srcname":"Comp","srcport":"4107","srcintf":"internal1","dstip":"98.165.205.106","dstport":"16437","dstintf":"wan1","sessionid":"229855382","action":"ip-conn","policyid":"1","crscore":"1375731722","craction":"262144"}
{"index":{"_index":"logstash-2014.10.01","_type":"traffic","_id":"o6CM9OukSqKT4G3sxmH9AA"}
{"message":"<188>date=2014-10-01 time=12:52:04 devname=fw01 devid=asdf logid=0000000011 type=traffic subtype=forward level=warning vd=root srcip=10.10.80.101 srcport=34647 srcintf=\"wifi\" srcssid=\"zerocool\" dstip=72.21.81.96 dstport=80 dstintf=\"wan1\" sessionid=228820382 action=ip-conn policyid=2 crscore=1375731722 craction=262144","@version":"1","@timestamp":"2014-10-01T19:51:30.859Z","type":"traffic","tags":["fortigate"],"host":"10.1.0.1","<188>date":"2014-10-01","time":"12:52:04","devname":"fw01","devid":"asdf","logid":"0000000011","subtype":"forward","level":"warning","vd":"root","srcip":"10.10.80.101","srcport":"34647","srcintf":"wifi","srcssid":"zerocool","dstip":"72.21.81.96","dstport":"80","dstintf":"wan1","sessionid":"228820382","action":"ip-conn","policyid":"2","crscore":"1375731722","craction":"262144"}

Here is the output from elasticdump that seems to be providing the same (or similar) output:

[
{"_index":"logstash-2014.10.01","_type":"traffic","_id":"soLaJFfsQKe1DwvSr0zs-g","_score":0,"_source":{"message":"<188>date=2014-10-01 time=13:44:53 devname=fw01 devid=asdf logid=0000000011 type=traffic subtype=forward level=warning vd=root srcip=10.1.0.52 srcname=Comp srcport=4107 srcintf=\"internal1\" dstip=98.165.205.106 dstport=16437 dstintf=\"wan1\" sessionid=229855382 action=ip-conn policyid=1 crscore=1375731722 craction=262144","@version":"1","@timestamp":"2014-10-01T20:44:19.556Z","type":"traffic","tags":["fortigate"],"host":"10.1.0.1","<188>date":"2014-10-01","time":"13:44:53","devname":"fw01","devid":"asdf","logid":"0000000011","subtype":"forward","level":"warning","vd":"root","srcip":"10.1.0.52","srcname":"Comp","srcport":"4107","srcintf":"internal1","dstip":"98.165.205.106","dstport":"16437","dstintf":"wan1","sessionid":"229855382","action":"ip-conn","policyid":"1","crscore":"1375731722","craction":"262144"}}
,{"_index":"logstash-2014.10.01","_type":"traffic","_id":"o6CM9OukSqKT4G3sxmH9AA","_score":0,"_source":{"message":"<188>date=2014-10-01 time=12:52:04 devname=fw01 devid=asdf logid=0000000011 type=traffic subtype=forward level=warning vd=root srcip=10.10.80.101 srcport=34647 srcintf=\"wifi\" srcssid=\"zerocool\" dstip=72.21.81.96 dstport=80 dstintf=\"wan1\" sessionid=228820382 action=ip-conn policyid=2 crscore=1375731722 craction=262144","@version":"1","@timestamp":"2014-10-01T19:51:30.859Z","type":"traffic","tags":["fortigate"],"host":"10.1.0.1","<188>date":"2014-10-01","time":"12:52:04","devname":"fw01","devid":"asdf","logid":"0000000011","subtype":"forward","level":"warning","vd":"root","srcip":"10.10.80.101","srcport":"34647","srcintf":"wifi","srcssid":"zerocool","dstip":"72.21.81.96","dstport":"80","dstintf":"wan1","sessionid":"228820382","action":"ip-conn","policyid":"2","crscore":"1375731722","craction":"262144"}}
...
]

It looks like the output is putting on two lines what should be a single JSON object. Additionally, even if you take the two lines together, there are some missing curly braces (even on the first line alone there is a missing curly brace).

The single-line output seems to make the most sense, but any valid JSON would suit my purposes.

"client is closed" error during import

Seems to happen intermittently but when it does it cancels the whole import which then needs to be restarted:

[2014-07-11 14:28:08,202][ERROR][org.xbib.elasticsearch.action.RestExportAction] [Count Nefaria] client is closed
org.elasticsearch.ElasticsearchIllegalStateException: client is closed
        at org.xbib.elasticsearch.support.client.bulk.BulkTransportClient.index(BulkTransportClient.java:266)
        at org.xbib.elasticsearch.action.RestExportAction$ExportThread.run(RestExportAction.java:350)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

On ES 1.2.1 running in cluster mode with 4 nodes.

Exports list returned by /_export/state is empty

Hi,

While exporting an index using

curl -XPOST localhost:9200/indexname/_export?target=/tmp/indexname.tar.gz

I can't verify the state of exports calling

curl -XGET localhost:9200/_export/state

as it returns an empty list:

{"exports":[]}

There was definitely an export under process though as I could see the .tar.gz file size increasing.

Regards,
Nick

Issues with umlauts and other special chars

When I export my index and import it again, umlauts are broken. When going through the files in the generated tar.gz file, I see all special chars replaced by "?". I do the export using "curl -XPOST 'localhost:9200/_export?target=/tmp/test.tar.gz'"

Example:

étudiants encadrés => ?tudiants encadr?s
für => f?r

Am I missing something here, or maybe hitting a bug?

different settings/mappings

Hi,

nice idea! I will add more parameters to _import so different "_settings" and "_mapping" files can be used from anywhere in the filesystem, overriding the tar archive content.

I think of the pattern

<indexname>_settings=<filename>

and

<indexname>_<mappingname>_mapping=<filename>

Best regards,

Jörg

Hello,
We're using your Knapsack plugin to seed data to our production ES cluster. Great tool! Thank you so much.
I was wondering if you've thought about the use case where more shards are needed at the import location. Right now I'm just extracting the exported tarball, modifying index/_settings, and rebuilding the tarball. This, of course, can take quite a long while when there are over a million documents.
Would it make sense if either the _import or _export handlers accepted a new number of shards?

Can't install manually

After installing knapsack manually by uncompressing the zip in plugins/knapsack, the contents of the zip are a knapsack jar and commons-compress.

[2013-10-25 11:11:37,160][DEBUG][node ] [Bloom, Astrid] using home [/usr/share/elasticsearch], config [/etc/elasticsearch], data [[/var/lib/elasticsearch]], logs [/var/log/elasticsearch], work [/tmp/elasticsearch], plugins [/usr/share/elasticsearch/plugins]
[2013-10-25 11:11:37,168][INFO ][plugins ] [Bloom, Astrid] loaded [knapsack], sites [head]

if I try myhost/myindex/_export I get

No handler found for uri [/myindex/_export] and method [GET]

how can I run it?

RoutingMissingException for import of child documents

Importing child documents results in an org.elasticsearch.action.RoutingMissingException, since the parent id is required for POSTing child documents. Perhaps you could use some directory convention on export whereby all child records are written to the directory of their parent or something like that.

Support for ES >= 1.3

Hi,

could it be possible to have a commit so that the plugin can be installed in ES 1.3?

I tried to rebuild it with the new ES sources, but pom.xml gives me a lot of dependency errors and it can't find some dependencies.

Data not imported when index does not exist (ES 1.1); works when index pre-created

When trying to import data from a .tar.gz file using ElasticSearch 1.1, the data import operation does not appear to succeed if the index is not created first. The operation works if I send a POST to create the index, then run the _import command with createIndex=false in the URL.

For example, here is a sequence with the test data in a clean ES 1.1 environment (loaded plugins are knapsack and bigdesk). I am exporting and then importing to the same instance, clearing the index in between operations.

# delete test index to ensure it no longer exists
curl -XDELETE 'http://localhost:9200/test'

# create new test index and populate with data
curl -XPUT localhost:9200/test/test/1 -d '{"key":"value 1"}'
curl -XPUT localhost:9200/test/test/2 -d '{"key":"value 2"}'

# make sure entries exist (2 hits)
curl -XGET 'localhost:9200/test/test/_search?q=*&pretty'

# export index
curl -XPOST localhost:9200/test/test/_export?path=/tmp/test_export.tar.gz

# clear test index so we have a clean slate, confirm 404 missing index on first search and 0 hits on second search
curl -XDELETE localhost:9200/test
curl -XGET 'localhost:9200/test/test/_search?q=*&pretty'
curl -XGET 'localhost:9200/_search?q=*&pretty'

# run import operation
curl -XPOST 'localhost:9200/test/test/_import?path=/tmp/test_export.tar.gz'

# check for query results - 0 hits received, should have 2
curl -XGET 'localhost:9200/test/test/_search?q=*&pretty'

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

The following sequence works properly after exporting the same set of data:

# clear 'test' index
curl -XDELETE localhost:9200/test

# confirm no search results (IndexMissingException)
curl -XGET 'localhost:9200/test/test/_search?q=*&pretty'

# create index
curl -XPOST localhost:9200/test

# import data, do not create index
curl -XPOST 'localhost:9200/test/test/_import?path=/tmp/test_export.tar.gz&createIndex=false'

# query and make sure items are present
curl -XGET 'localhost:9200/test/test/_search?q=*&pretty'

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test",
      "_type" : "test",
      "_id" : "2",
      "_score" : 1.0, "_source" : {"key":"value 2"}
    }, {
      "_index" : "test",
      "_type" : "test",
      "_id" : "1",
      "_score" : 1.0, "_source" : {"key":"value 1"}
    } ]
  }
}

Could this be an issue with exporting and then importing the index to the same environment? This problem also appears to occur when I export/import all indices from the ElasticSearch environment as well; the POST method to create an index needs to be explicitly called rather than letting knapsack create the index.

List exports in progress

It would be fantastic if there was a way to query for exports in progress. I'm rather new to ElasticSearch but I think there might be at least two options:

  1. GET on _export returns if an export is running on that type/index/_all (maybe with a running total?).
  2. POST on _export leads to metadata being set on the index. This could be a timestamp for the most recent export and an additional value if the export is running, finished or aborted.

Support of ES 1.4

Hi!

First of all, thank you for this great plugin, nice job and really useful!!!

I wanted to ask if you plan to upgrade the plugin to support ES 1.4 in the near future? It seems like it totally breaks ES after installation for now...

Adding support for backups to AWS S3

Hello,

We are working on extending knapsack in order to support backups to AWS S3. However, the AWS Java SDK is about 20mb while knapsack is roughly 2mb. Do you consider the increased file size of the package an issue?

If size is an issue, we could probably use maven assembly plugin to generate two different packages (i.e. elasticsearch-knapsack-2.5.1.zip, elasticsearch-knapsack-aws-2.5.1.zip).
Of course, we are open to suggestions or alternative approaches.

Regards,
Nick

Cannot find any archive file

Hi,

I am using the same steps

curl -XDELETE localhost:9200/test
curl -XPUT localhost:9200/test/test/1 -d '{"key":"value 1"}'
curl -XPUT localhost:9200/test/test/2 -d '{"key":"value 2"}'
curl -XPOST localhost:9200/test/test/_export

As the result I get

{"ok":true}

But I cannot find any archive file in any directory. Using ElasticSearch version 0.20.5, JVM 23.3-b01 and Java

java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) Server VM (build 23.3-b01, mixed mode)

Error gunzipping bulk.gz files

After exporting to an index.bulk.gz file, I am unable to gunzip the data. It works fine for tar.gz, but there seems to be an issue with bulk. I get the following error:

gzip: myindex.bulk.gz: unexpected end of file

Null Pointer Exception on Export

Hi,
I get the following error
{"error":"NullPointerException[null]","status":500}
when trying to export data from an index called "notes" using the URL
http://localhost:9200/notes/_export

I get a similar exception when I try using the export on the ElasticSearch node using http://localhost:9200/_export to export the data from all indices associated with the node. Any ideas on what is causing this? Are there any logs that would help find the underlying issue?

Issue with importing exported items with knapsack

I am using knapsack to export the index items to a tar file. One of the peculiarities of the documents in my elasticsearch instance is that their ids are of the form

"testcustomer/testservice/video(572)"

An example json of what the document looks like is

{ "_id" : "testcustomer/testservice/video(572)", "Title" : "Crashes", "SourceId" : "572", "Counters" : { }, "Relationships" : { }, "BlockedReason" : [], "BlockedReasonCount" : 0, "Version" : 1, "Locales" : { "Invariant" : { "Metadata" : { "itemXmlRootNode" : { "@stringAttribute" : "dave", "@xmlns:metacafe" : "http://www.metacafe.com/schema/", "@xmlns:media" : "http://search.yahoo.com/mrss/", "pubDate" : "2005-07-06T16:53:07.000Z", "metacafe:itemID" : 572.0, "media:description" : "Can you explain how did that happen???", "media:title" : "Crashes", "media:keywords" : "Cars Crash Accidents", "@@media:keywords" : { "@Label" : "Tags" }, "media:rating" : "nonadult", "##media:rating" : null, "@@media:rating" : { "@scheme" : "urn:simple" }, "metacafe:controversial" : "noncontroversial", "metacafe:intProperty" : "1" } }, "ModifiedDate" : "2013-10-15T09:30:06.504Z", "Checksum" : "28C68F52F2177D26053C8C3D62A79FFD", "Inactive" : false } } }

The export runs fine and exports the items, however these items do not get imported into the new index. Having investigated this issue, it appears that the document files are generated under the folder hierarchy "testcustomer/testservice/" instead of the structure that the import utility expects. This results in no documents being imported from the exported tar file.

I have tried an index without the forward slashes in the ids, and those documents appear to get exported and imported correctly, as per the documentation.

Also, it appears that one file per document is created inside the tar file. This could result in 1 million files if I had a million items in my index. Is there a way this could be batched into bigger files with multiple documents, maybe with a configurable batch size per file?

exported files are tiny, and compressed improperly

I'm following the examples in the tutorial to export data:

zck:~$ curl -XDELETE localhost:9200/test
{"ok":true,"acknowledged":true}
zck:~$ curl -XPUT localhost:9200/test/test/1 -d '{"key":"value 1"}'
{"ok":true,"_index":"test","_type":"test","_id":"1","_version":1}
zck:~$ curl -XPUT localhost:9200/test/test/2 -d '{"key":"value 2"}'
{"ok":true,"_index":"test","_type":"test","_id":"2","_version":1}
zck:~$ curl -XPOST localhost:9200/test/test/_export?target=/log/elasticsearch.tar.gz
{"ok":true}
zck:~$ 

This is fine, but let me now look at the exported file:

zck:~$ ll /log/elasticsearch.tar.gz 
-rw-r--r-- 1 elasticsearch elasticsearch 10 Dec  4 23:58 /log/elasticsearch.tar.gz
zck:~$ tar -xf /log/elasticsearch.tar.gz 

gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now

While I'm at it, let me try the other export compression types.

zck:~$ curl -XPOST localhost:9200/test/test/_export?target=/log/elasticsearch.tar.bz2
{"ok":true}
zck:~$ ll /log/elasticsearch.tar.bz2 
-rw-r--r-- 1 elasticsearch elasticsearch 0 Dec  5 00:02 /log/elasticsearch.tar.bz2
zck:~$ tar -xf /log/elasticsearch.tar.bz2 
tar: This does not look like a tar archive

bzip2: Compressed file ends unexpectedly;
    perhaps it is corrupted?  *Possible* reason follows.
bzip2: Inappropriate ioctl for device
    Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

tar: Child returned status 2
tar: Error is not recoverable: exiting now

zck:~$ curl -XPOST localhost:9200/test/test/_export?target=/log/elasticsearch.tar.xz
{"ok":true}
zck:~$ ll /log/elasticsearch.tar.xz 
-rw-r--r-- 1 elasticsearch elasticsearch 12 Dec  5 00:03 /log/elasticsearch.tar.xz
zck:~$ tar -xf /log/elasticsearch.tar.xz 
xz: (stdin): Unexpected end of input
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Not only are none of the files right, but they're all only a few bytes.

Here's what seems to be the relevant part of the elasticsearch log file for the first .tar.gz export file:

[2013-12-04 23:58:18,876][INFO ][org.xbib.elasticsearch.action.RestExportAction] [ES12] starting export to /log/elasticsearch.tar.gz
[2013-12-04 23:58:18,877][DEBUG][cluster.service          ] [ES12] processing [cluster_update_settings]: execute
[2013-12-04 23:58:18,877][DEBUG][cluster.service          ] [ES12] cluster state updated, version [108], source [cluster_update_settings]
[2013-12-04 23:58:18,878][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: execute
[2013-12-04 23:58:18,878][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-12-04 23:58:18,879][DEBUG][cluster.service          ] [ES12] processing [cluster_update_settings]: done applying updated cluster_state
[2013-12-04 23:58:18,879][DEBUG][cluster.service          ] [ES12] processing [reroute_after_cluster_update_settings]: execute
[2013-12-04 23:58:18,881][DEBUG][cluster.service          ] [ES12] cluster state updated, version [109], source [reroute_after_cluster_update_settings]
[2013-12-04 23:58:18,881][INFO ][org.xbib.elasticsearch.action.RestExportAction] [ES12] getting settings for index test
[2013-12-04 23:58:18,881][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: execute
[2013-12-04 23:58:18,881][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-12-04 23:58:18,882][DEBUG][cluster.service          ] [ES12] processing [reroute_after_cluster_update_settings]: done applying updated cluster_state
[2013-12-04 23:58:18,882][DEBUG][cluster.service          ] [ES12] processing [cluster_update_settings]: execute
[2013-12-04 23:58:18,882][DEBUG][cluster.service          ] [ES12] cluster state updated, version [110], source [cluster_update_settings]
[2013-12-04 23:58:18,883][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: execute
[2013-12-04 23:58:18,883][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-12-04 23:58:18,883][DEBUG][cluster.service          ] [ES12] processing [cluster_update_settings]: done applying updated cluster_state
[2013-12-04 23:58:18,884][DEBUG][cluster.service          ] [ES12] processing [reroute_after_cluster_update_settings]: execute
[2013-12-04 23:58:18,885][DEBUG][cluster.service          ] [ES12] cluster state updated, version [111], source [reroute_after_cluster_update_settings]
[2013-12-04 23:58:18,886][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: execute
[2013-12-04 23:58:18,886][DEBUG][river.cluster            ] [ES12] processing [reroute_rivers_node_changed]: no change in cluster_state
[2013-12-04 23:58:18,887][DEBUG][cluster.service          ] [ES12] processing [reroute_after_cluster_update_settings]: done applying updated cluster_state

Mass import is not working

Hello,

I'm trying to move indices between two servers:

First I export each one. (20 indices)
Then import them on the new server using the following:

for f in *; do curl -XPOST localhost:9200/"${f:0:19}"/_import?path=/tmp/extracted/"$f"; done

When checking the disk space used by one index, it is tiny compared to the space used on the previous server (like 2 MB versus 50 MB).
Lots of data is missing when querying the index.

But when I import each index one by one, making sure that there is only one import running at a time, the space used is consistent between the two servers.

Example archive name: logstash-2014.03.07.tar.gz

Exporting on ES 0.90.9
Importing on ES 1.0.1 (also tried on 0.90.11 with same issue)

Getting error while trying to export

Hi
I tried to do the following.
POST localhost:9200/test/_export
{
"running":true,
"mode":"export",
"archive":"tar",
"path":"/Users/user1/Downloads/elasticsearch-1.1.1/test1.tar.gz"
}

Not sure what I'm missing.

[2014-05-05 14:24:50,646][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] params = {index=localhost:9200, type=test}
[2014-05-05 14:24:50,646][WARN ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] no map defined
[2014-05-05 14:24:50,646][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] index/type map = {}
[2014-05-05 14:24:50,646][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] start of export: {"started":"2014-05-05T19:24:50.646Z","path":"localhost:9200_test.tar.gz","map":{},"uri":null,"copy":false,"s3":false}
[2014-05-05 14:24:50,651][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] getting settings for indices [localhost:9200]
[2014-05-05 14:24:50,651][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] found indices: []
[2014-05-05 14:24:50,651][ERROR][org.xbib.elasticsearch.action.RestExportAction] [Motormouth] [localhost:9200] missing
org.elasticsearch.indices.IndexMissingException: [localhost:9200] missing
at org.elasticsearch.cluster.metadata.MetaData.convertFromWildcards(MetaData.java:773)
at org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:661)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.<init>(TransportSearchTypeAction.java:109)
at org.elasticsearch.action.search.type.TransportSearchScanAction$AsyncAction.<init>(TransportSearchScanAction.java:58)
at org.elasticsearch.action.search.type.TransportSearchScanAction$AsyncAction.<init>(TransportSearchScanAction.java:55)
at org.elasticsearch.action.search.type.TransportSearchScanAction.doExecute(TransportSearchScanAction.java:52)
at org.elasticsearch.action.search.type.TransportSearchScanAction.doExecute(TransportSearchScanAction.java:42)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:114)
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:49)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:85)
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:207)
at org.xbib.elasticsearch.action.RestExportAction$ExportThread.run(RestExportAction.java:321)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Allow writing into empty index with _push

A _push action into an existing empty index fails. It should be allowed, for ease of copying an index.

[2014-10-20 21:21:40,626][ERROR][KnapsackPushAction       ] Failed execution
org.elasticsearch.common.util.concurrent.UncategorizedExecutionException: Failed execution
    at org.elasticsearch.action.support.AdapterActionFuture.rethrowExecutionException(AdapterActionFuture.java:90)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:50)
    at org.xbib.elasticsearch.action.knapsack.push.TransportKnapsackPushAction.performPush(TransportKnapsackPushAction.java:182)
    at org.xbib.elasticsearch.action.knapsack.push.TransportKnapsackPushAction$1.run(TransportKnapsackPushAction.java:108)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.transport.RemoteTransportException: [Honcho][inet[/192.168.1.250:9300]][indices/create]
Caused by: org.elasticsearch.indices.IndexAlreadyExistsException: [newindex] already exists
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateIndexName(MetaDataCreateIndexService.java:164)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validate(MetaDataCreateIndexService.java:539)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.access$100(MetaDataCreateIndexService.java:89)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:229)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:328)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

Import: bulk error, NoNodeAvailableException and missing content

Using ElasticSearch 1.1, we are seeing a problem with the import mechanism and this plugin throwing stacktraces and not successfully inserting all data from the output .tar.gz file.

At the end of the bulk import process, ElasticSearch throws NoNodeAvailableException(s) corresponding to the maxBulkConcurrency setting. The number of records missing in the destination ElasticSearch instance appears to fall below (maxBulkConcurrency * maxActionsPerBulkRequest), so on a 4-threaded configuration with 1000 actions, we could see up to 4000 missing entries.

Stacktrace:

[2014-06-12 20:51:47,637][INFO ][BulkTransportClient      ] closing bulk processor...
[2014-06-12 20:51:47,685][INFO ][BulkTransportClient      ] shutting down...
[2014-06-12 20:51:47,722][ERROR][BulkTransportClient      ] bulk [5626] error
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
    at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:263)
    at org.elasticsearch.action.TransportActionNodeProxy$1.handleException(TransportActionNodeProxy.java:89)
    at org.elasticsearch.transport.TransportService$Adapter$2$1.run(TransportService.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
[2014-06-12 20:51:47,722][ERROR][BulkTransportClient      ] bulk [5628] error
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
    at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:263)
    at org.elasticsearch.action.TransportActionNodeProxy$1.handleException(TransportActionNodeProxy.java:89)
    at org.elasticsearch.transport.TransportService$Adapter$2$1.run(TransportService.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
[2014-06-12 20:51:47,732][ERROR][BulkTransportClient      ] bulk [5625] error
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
    at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:263)
    at org.elasticsearch.action.TransportActionNodeProxy$1.handleException(TransportActionNodeProxy.java:89)
    at org.elasticsearch.transport.TransportService$Adapter$2$1.run(TransportService.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
[2014-06-12 20:51:47,746][ERROR][BulkTransportClient      ] bulk [5627] error
org.elasticsearch.client.transport.NoNodeAvailableException: No node available
    at org.elasticsearch.client.transport.TransportClientNodesService$RetryListener.onFailure(TransportClientNodesService.java:263)
    at org.elasticsearch.action.TransportActionNodeProxy$1.handleException(TransportActionNodeProxy.java:89)
    at org.elasticsearch.transport.TransportService$Adapter$2$1.run(TransportService.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
[2014-06-12 20:51:47,747][INFO ][BulkTransportClient      ] shutting down completed

Tracking this down through the source for elasticsearch-knapsack and elasticsearch-support, I believe we are running into a "failure to flush" condition when closing out the import, where not all records are successfully written before each thread is terminated.

In src/main/java/org/xbib/elasticsearch/action/RestImportAction.java, after the logger.info("end of import: {}", status); call, a method call is made to bulkClient.shutdown();. This in turn calls BulkTransportClient's super.shutdown(); method, which runs the following code in BaseTransportClient:

    public synchronized void shutdown() {
        if (client != null) {
            client.close();
            client.threadPool().shutdown();
            client = null;
        }
        addresses.clear();
    }

At no point does the plugin or support mechanism appear to wait. My suggestion would be to add a line in RestImportAction.java to perform bulkClient.flush() prior to the shutdown call, or update the elasticsearch-support plugin to ensure that a shutdown waits for pending actions to complete.

got NoSuchMethodError

We've got an error with the newest elasticsearch (our ES version: elasticsearch 0.90.5, official rpm).

I think elasticsearch-knapsack hasn't kept up with this commit: elastic/elasticsearch@b27e7d3

# try export kibana daily index
curl -XPOST 'http://localhost:9200/logstash-2013.09.17/fluentd/_export?target=/tmp/logstash-2013.09.17.tar.gz'
[2013-09-24 17:04:11,679][INFO ][node                     ] [Hood] version[0.90.5], pid[5672], build[c8714e8/2013-09-17T13:09:46Z]
[2013-09-24 17:04:11,680][INFO ][node                     ] [Hood] initializing ...
[2013-09-24 17:04:11,695][INFO ][plugins                  ] [Hood] loaded [knapsack], sites [head]
[2013-09-24 17:04:13,741][WARN ][monitor.jvm              ] [Hood] ignoring gc_threshold for [ConcurrentMarkSweep], missing warn/info/debug values
[2013-09-24 17:04:13,741][WARN ][monitor.jvm              ] [Hood] ignoring gc_threshold for [ParNew], missing warn/info/debug values
[2013-09-24 17:04:15,394][INFO ][node                     ] [Hood] initialized
[2013-09-24 17:04:15,394][INFO ][node                     ] [Hood] starting ...
[2013-09-24 17:04:15,529][INFO ][transport                ] [Hood] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/10.0.247.93:9300]}
[2013-09-24 17:04:18,571][INFO ][cluster.service          ] [Hood] new_master [Hood][6tdLuu4LRGSg0eIU-fMR4g][inet[/10.0.247.93:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-24 17:04:18,619][INFO ][discovery                ] [Hood] elasticsearch/6tdLuu4LRGSg0eIU-fMR4g
[2013-09-24 17:04:18,702][INFO ][http                     ] [Hood] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/10.0.247.93:9200]}
[2013-09-24 17:04:18,703][INFO ][node                     ] [Hood] started
[2013-09-24 17:04:20,638][INFO ][gateway                  ] [Hood] recovered [25] indices into cluster_state
[2013-09-24 17:04:43,469][WARN ][monitor.jvm              ] [Hood] [gc][ParNew][28][2] duration [1s], collections [1]/[1s], total [1s]/[1.2s], memory [1.7gb]->[573.3mb]/[3.8gb], all_pools {[Code Cache] [4.1mb]->[4.1mb]/[48mb]}{[Par Eden S
pace] [1.6gb]->[52.1mb]/[1.6gb]}{[Par Survivor Space] [128.6mb]->[204.7mb]/[204.7mb]}{[CMS Old Gen] [0b]->[316.4mb]/[2gb]}{[CMS Perm Gen] [38.9mb]->[38.9mb]/[82mb]}
[2013-09-24 17:06:59,020][WARN ][http.netty               ] [Hood] Caught exception while handling client http traffic, closing connection [id: 0x0b9908ba, /127.0.0.1:58834 => /127.0.0.1:9200]
java.lang.NoSuchMethodError: org.elasticsearch.rest.action.support.RestActions.splitIndices(Ljava/lang/String;)[Ljava/lang/String;
        at org.xbib.elasticsearch.action.RestExportAction.handleRequest(RestExportAction.java:67)
        at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
        at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
        at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
        at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:291)
        at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
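
If that commit is indeed the one that removed RestActions.splitIndices, a fix on the plugin side would be to split the comma-separated index parameter itself. A one-line sketch, assuming the org.elasticsearch.common.Strings helper that core switched to is available in the targeted ES version (treat this as an assumption, not a verified patch):

    // in RestExportAction.handleRequest(...), replacing the removed
    // RestActions.splitIndices(request.param("index")) call:
    String[] indices = Strings.splitStringByCommaToArray(request.param("index"));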

Not able to use the _import action in a Windows environment

Trying to do the following in the Windows command line:
D:>curl -XPOST "localhost:9200/_import"
{"running":true,"mode":"import","type":"tar","path":"file:_all.tar.gz"}

Got errors in the ES console; it looks like the issue is caused by File.separator (see the sketch after the stack trace):

Unexpected internal error near index 1
\
 ^
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
at java.util.regex.Pattern.error(Pattern.java:1924)
at java.util.regex.Pattern.compile(Pattern.java:1671)
at java.util.regex.Pattern.<init>(Pattern.java:1337)
at java.util.regex.Pattern.compile(Pattern.java:1022)
at java.lang.String.split(String.java:2313)
at java.lang.String.split(String.java:2355)
at org.xbib.elasticsearch.plugin.knapsack.KnapsackPacket.decodeName(KnapsackPacket.java:59)
at org.xbib.elasticsearch.action.RestImportAction$ImportThread.run(RestImportAction.java:228)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
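
The trace points at KnapsackPacket.decodeName() splitting an archive entry name with String.split(File.separator); on Windows the separator is a single backslash, which is not a valid regular expression on its own. A minimal, standalone illustration of the failure and the usual fix (Pattern.quote); the entry name below is made up for the demo:

    import java.io.File;
    import java.util.Arrays;
    import java.util.regex.Pattern;

    public class SeparatorSplitDemo {
        public static void main(String[] args) {
            String entryName = "test" + File.separator + "test" + File.separator + "1" + File.separator + "_source";

            // On Windows the next line throws PatternSyntaxException, because File.separator
            // is "\" and String.split() treats its argument as a regular expression:
            // entryName.split(File.separator);

            // Quoting the separator makes it a literal on every platform:
            String[] parts = entryName.split(Pattern.quote(File.separator));
            System.out.println(Arrays.toString(parts));
        }
    }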

Working with Elasticsearch 0.90.9

I just installed the latest version of Elasticsearch (0.90.9). After running the POST command:
curl -XPOST localhost:9200/logstash-2013.12.17/_export?target=/mnt/backup/logstash-20131217.tar.gz

I found these logs from elasticsearch:
[2013-12-31 01:52:32,387][INFO ][org.xbib.elasticsearch.action.RestExportAction] [log-es3] starting export to /mnt/backup/logstash-20131217.tar.gz
[2013-12-31 01:52:32,565][INFO ][org.xbib.elasticsearch.action.RestExportAction] [log-es3] getting settings for index logstash-2013.12.17

but there is no data in the target file:
-rw-r--r-- 1 elasticsearch elasticsearch 10 Dec 31 01:52 logstash-20131217.tar.gz

Is this a compatibility issue?

Support exporting of all stored fields for search results

According to the documentation, ES supports retrieving all stored fields in search results using "*", so this request should work:

curl -XPOST 'localhost:9200/test/test/_export' -d '{
   "query" : {
       "match_phrase" : {
           "key" : "value 1"
       }
   },
   "fields" : [ "*" ]
}'

But it doesn't, apparently because knapsack uses the GET API, which doesn't support this feature.
We could probably switch to the POST API to support this (a rough sketch follows); what do you think?
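
For comparison, a rough sketch of the two client-side calls in the Java API of that era; addField / setFields are the stored-field accessors I'd expect on the 0.90/1.x request builders, so treat the exact method names as assumptions rather than the plugin's actual code:

    import org.elasticsearch.action.get.GetResponse;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.index.query.QueryBuilders;

    public class StoredFieldsSketch {

        // Search API: "*" is documented to load all stored fields of each hit.
        static SearchResponse searchAllStoredFields(Client client) {
            return client.prepareSearch("test")
                    .setTypes("test")
                    .setQuery(QueryBuilders.matchPhraseQuery("key", "value 1"))
                    .addField("*")
                    .execute().actionGet();
        }

        // Get API: per the report above, "*" is not expanded here, which would explain
        // why the export comes back without the extra stored fields.
        static GetResponse getWithFields(Client client) {
            return client.prepareGet("test", "test", "1")
                    .setFields("*")
                    .execute().actionGet();
        }
    }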

_export with target specified returns a status code of 0 and exception in logs with 2.1.4 (0.90.3 ES)

I created an index "test" with two documents in it and tried to export it
using the following URL: http://localhost:9200/test/_export?target=/big/space/aive2.tar.gz. I was returned a status code of 0 and noticed the following exception in the logs. I can also see an empty tar file being created, which appears to be corrupt.

java.lang.NoSuchMethodError: org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.<init>(Ljava/io/OutputStream;Ljava/lang/String;)V
at org.xbib.io.commons.TarSession.open(TarSession.java:102)
at org.xbib.elasticsearch.action.RestExportAction.handleRequest(RestExportAction.java:103)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:290)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:43)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Import target parameter fails when a file extension is provided

When a target is specified on import, the plugin fails if the path already includes a file extension.

Specifying target=/path/to/archive.tar.gz produces

SEVERE: [node-test-1379778841182] check existence or access rights: /path/to/archive.tar.gz.tar.gz
java.io.FileNotFoundException: check existence or access rights: /path/to/archive.tar.gz.tar.gz
    at org.xbib.io.commons.TarSession.open(TarSession.java:65)
    at org.xbib.elasticsearch.action.RestImportAction.handleRequest(RestImportAction.java:99)

It looks as though it tries to append the file extension after the scheme has been determined (a sketch of a possible guard follows).
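
A small, hypothetical guard illustrating the fix: only append the default archive suffix when the target does not already end with one of the known suffixes. The helper name and suffix list are illustrative, not taken from the plugin:

    public class TargetSuffix {

        private static final String[] KNOWN_SUFFIXES = {
                ".tar", ".zip", ".cpio", ".bulk", ".gz", ".bzip2", ".xz", ".lzf"
        };

        // Returns the target unchanged if it already names an archive, otherwise
        // appends the default suffix, e.g. "/path/to/archive" -> "/path/to/archive.tar.gz".
        static String withDefaultSuffix(String target, String defaultSuffix) {
            for (String suffix : KNOWN_SUFFIXES) {
                if (target.endsWith(suffix)) {
                    return target;
                }
            }
            return target + defaultSuffix;
        }

        public static void main(String[] args) {
            System.out.println(withDefaultSuffix("/path/to/archive.tar.gz", ".tar.gz")); // unchanged
            System.out.println(withDefaultSuffix("/path/to/archive", ".tar.gz"));        // suffix added
        }
    }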

Tar.gz file is corrupted

Export is not working on my 0.90.1. The tar.gz file is only 10 bytes, and when I try to import it I get:

[2013-06-14 09:00:06,381][ERROR][rest.action ] [dba_node1] Unexpected end of ZLIB input stream
java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:116)
at org.xbib.io.tar.TarBuffer.readBlock(TarBuffer.java:156)
at org.xbib.io.tar.TarBuffer.readRecord(TarBuffer.java:191)
at org.xbib.io.tar.TarInputStream.getNextEntry(TarInputStream.java:186)
at org.xbib.io.tar.TarEntryReadOperator.read(TarEntryReadOperator.java:52)
at org.xbib.io.tar.TarSession.read(TarSession.java:190)
at org.elasticsearch.rest.action.RestImportAction$1.run(RestImportAction.java:127)
at java.lang.Thread.run(Thread.java:722)

It seems that the file is corrupted.

0.19.8 missing

The zip file for this version of the plugin is missing.

Error when exporting from Elasticsearch 0.90.10

Hello! I'm receiving the following error when running knapsack on a fresh install of Elasticsearch 0.90.10:

{"error":"IncompatibleClassChangeError[Found class org.elasticsearch.rest.RestRequest, but interface was expected]","status":500}

Steps to reproduce:

  1. Install Elasticsearch 0.90.10
  2. curl -XPUT localhost:9200/test/test/1 -d '{"key":"value 1"}'
  3. curl -XPUT localhost:9200/test/test/2 -d '{"key":"value 2"}'
  4. curl -XPOST localhost:9200/test/test/_export
  5. Error from above is thrown

Any help would be appreciated.

Thanks!

Support for 0.90.7

An error is received when trying to export from the latest version:

{"error":"ElasticSearchTimeoutException[Timeout waiting for task.]","status":500}

Importing full index settings+data not working

I export the whole index (it contains 3 document types) with the following command:

curl -XPOST localhost:9200/my_index/_export

It works great: all data and settings are exported. But when I try to import the index (settings + data), only the index is created and no data is indexed.

curl -XPOST localhost:9200/test/_import

Export: fail fast on bad parameters

I think this plugin could be greatly improved by failing fast on the following checks, before asynchronously returning:

  • Cluster 'yellow' check
  • ConnectionFactory factory = service.getConnectionFactory(scheme);
  • Connection<Session> connection = factory.getConnection(URI.create(scheme + ":" + target));

These operations should be moved out of the spawned thread, and any exceptions caught and returned before signaling OK (a sketch of that ordering follows).
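
A minimal, standalone sketch of that ordering: the checks from the list above run on the calling thread, any failure is reported immediately, and only a validated connection is handed to the worker. All names here are illustrative; the real plugin would plug in its own cluster-health check and connection factory:

    import java.net.URI;
    import java.util.concurrent.ExecutorService;

    public class FailFastExport {

        interface Checks {
            void ensureClusterAtLeastYellow() throws Exception;        // 1. health check
            AutoCloseable openConnection(URI target) throws Exception; // 2./3. factory + connection
        }

        static String startExport(Checks checks, String scheme, String target, ExecutorService pool) {
            final AutoCloseable connection;
            try {
                checks.ensureClusterAtLeastYellow();
                connection = checks.openConnection(URI.create(scheme + ":" + target));
            } catch (Exception e) {
                // fail fast: the caller gets the error instead of a bogus "running" status
                return "{\"running\":false,\"error\":\"" + e.getMessage() + "\"}";
            }
            pool.submit(new Runnable() {
                public void run() {
                    // long-running export work only starts once validation has passed
                    runExport(connection);
                }
            });
            return "{\"running\":true}";
        }

        static void runExport(AutoCloseable connection) {
            // ... stream documents into the archive, then close the connection ...
        }
    }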
