🗃 Open Distro Index Management

Home Page: https://opendistro.github.io/

License: Apache License 2.0

Kotlin 99.91% Python 0.09%

index-management's Issues

Support Elasticsearch 7.2

To be done at the end of development/before initial release.

Query Action

During the community meeting, someone asked if it was possible to have an action to modify the data in the index by using a query to delete documents over a certain threshold etc.

Since this involves accessing data it has more security implications that would need to be thought out.

Will use this issue to track it + see if other people are interested.

[BUG] Update policy doesn’t work

Describe the bug
Hello.
It looks like the update policy REST API doesn’t work.
I have an index with a policy named “ingest_policy”, and the policy was updated recently.
When I tried to update the index via REST, I got an error:
{ "error": "no handler found for uri [/_opendistro/_ism/update_policy/logging-price-2019.11.12?pretty] and method [POST]" }

Here is the query that I ran:

POST _opendistro/_ism/update_policy/logging-price-2019.11.12
{ "policy_id": "ingest_policy" }

At the same time, it works fine via the UI.
I’m running Open Distro 1.3.

Error updating policy via ISM API

Description
When updating a policy via the ISM API, the following error is thrown -

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Missing policy ID"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Missing policy ID"
  },
  "status" : 400
}

Both the '_id' and 'policy_id' fields are provided in the request body and the "_seq_no" and "_primary_term" match the values in the current policy which is the target of the update. An example of an attempt to update the policy (closely following examples from the Index State Management documentation online) is included below.

Package versions

elasticsearch-oss 7.3.2
opendistro-alerting 1.3.0.1-1
opendistro-index-management 1.3.0.1-1
opendistro-job-scheduler 1.3.0.0-1
opendistro-performance-analyzer 1.3.0.0-1
opendistro-security 1.3.0.0-0
opendistro-sql 1.3.0.0-1
opendistroforelasticsearch 1.3.0-1

To Reproduce

Create a new policy

#  curl -H "Content-Type: application/json" -u user:pass -XPUT https://<API ADDRESS>:9200/_opendistro/_ism/policies/ingest_policy?pretty -d @ingest_policy.json 

RESPONSE:                                                                                                                                                                                       
{                                                                                                                                                                                           
  "_id" : "ingest_policy",
  "_version" : 1,
  "_primary_term" : 1,
  "_seq_no" : 7,
  "policy" : {
    "policy" : {
      "policy_id" : "ingest_policy",
      "description" : "ingesting logs",
      "last_updated_time" : 1577978895023,
      "schema_version" : 1,
      "error_notification" : null,
      "default_state" : "ingest",
      "states" : [
        {
          "name" : "ingest",
          "actions" : [
            {
              "rollover" : {
                "min_doc_count" : 5
              }
            }
          ],
          "transitions" : [
            {
              "state_name" : "search"
            }
          ]
        },
        {
          "name" : "search",
          "actions" : [ ],
          "transitions" : [
            {
              "state_name" : "delete",
              "conditions" : {
                "min_index_age" : "5m"
              }
            }
          ]
        },
        {
          "name" : "delete",
          "actions" : [
            {
              "delete" : { }
            }
          ],
          "transitions" : [ ]
        }
      ]
    }
  }
}

Update the policy

#  curl -H "Content-Type: application/json" -u user:pass -XPUT https://<API ADDRESS>:9200/_opendistro/_ism/policies?pretty -d '
 {
  "_id": "ingest_policy",
  "_version": 1,
  "_primary_term": 1,
  "_seq_no": 7,
  "policy": {
    "policy_id": "ingest_policy",
    "description": "ingesting logs - updated version",
    "default_state": "ingest",
    "states": [
      {
        "name": "ingest",
        "actions": [
          {
            "rollover": {
              "min_doc_count": 5
            }
          }
        ],
        "transitions": [
          {
            "state_name": "search"
          }
        ]
      },
      {
        "name": "search",
        "actions": [],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "10m"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {}
          }
        ],
        "transitions": []
      }
    ]
  }
}
'
RESPONSE: 
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Missing policy ID"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Missing policy ID"
  },
  "status" : 400
}

Expected behavior
I would expect the policy to be updated but instead see an error message stating that the 'policy_id' is missing.

Have I missed a trick here or is this unexpected behaviour?

Cheers
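
For comparison, the ISM documentation (as best recalled here, so treat this as an assumption to verify for your version) addresses an update to the policy's own URL and passes the sequence number and primary term as query parameters instead of body fields. A minimal Python sketch that only builds such a request; the helper name `build_update_request` is hypothetical and nothing is sent over the network:

```python
import json
from urllib.parse import quote, urlencode


def build_update_request(base_url, policy_id, seq_no, primary_term, policy):
    """Return (method, url, body) for a policy update.

    Assumption: the update endpoint is
    PUT _opendistro/_ism/policies/<policy_id>?if_seq_no=..&if_primary_term=..
    and the body carries only the "policy" object.
    """
    query = urlencode({"if_seq_no": seq_no, "if_primary_term": primary_term})
    url = f"{base_url}/_opendistro/_ism/policies/{quote(policy_id, safe='')}?{query}"
    body = json.dumps({"policy": policy})
    return "PUT", url, body


# Values taken from the create response in this report.
method, url, body = build_update_request(
    "https://localhost:9200",  # placeholder address
    "ingest_policy",
    7,
    1,
    {"description": "ingesting logs - updated version"},
)
print(method, url)
```

If that assumption holds, the "Missing policy ID" error would be explained by the request going to `/_opendistro/_ism/policies` without the policy ID in the path.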

Better messaging about current status/info

As described here: https://discuss.opendistrocommunity.dev/t/stuck-in-attempting-to-transition/2212

A user got confused about what the "Attempting to transition" message meant. It's not really clear to someone new to the plugin, and we can definitely alleviate that.

Possible improvements include:

Change the wording
Add which conditions were checked and what their current values are to know why it's still waiting

Alias Action

During the community meeting, someone asked if it was possible to have an action to add an alias to the index.
Will use this issue to track it + see if other people are interested.

Support Elasticsearch 6.8

Any chance index-management will be available for 6.8 or can be back ported?

Implement allocation action

Expected Behavior

Our Open Distro cluster has two data node types: hot and warm. Each index has different retention times on the hot and warm boxes, so using Curator for this task gets tiresome. We'd like to use ILM for it instead.

We expect the policy for allocation to look like:

{
    "allocation": {
        "require": { "box_type": "hot" }
    }
}

It's also a good idea to provide include and exclude filters.

Current Behavior

The allocation action is not implemented.

Transactions around step execution

If we successfully execute a step but then fail to update the ManagedIndexMetaData, we can end up executing the same step again, which might not be idempotent or safe to do. So we need to add transactions to the execution phase so that we know whether we already attempted to execute a step.
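
One way to picture this is to record the intent before running the step and the outcome afterwards. The following Python sketch only illustrates that idea; `StepLog`, `run_step`, and the status strings are hypothetical names, not the plugin's implementation, and the in-memory dict stands in for persisted metadata:

```python
from dataclasses import dataclass, field


@dataclass
class StepLog:
    """In-memory stand-in for persisted step metadata (hypothetical)."""
    status: dict = field(default_factory=dict)  # step name -> "STARTING" | "COMPLETED"


def run_step(log, name, action, idempotent):
    """Execute `action` at most once unless the step is safe to repeat."""
    previous = log.status.get(name)
    if previous == "COMPLETED":
        return "skipped"
    if previous == "STARTING" and not idempotent:
        # An earlier attempt began but its outcome was never recorded,
        # so re-running blindly could apply the step twice.
        return "needs_manual_check"
    log.status[name] = "STARTING"   # record intent before executing
    action()
    log.status[name] = "COMPLETED"  # record outcome afterwards
    return "executed"
```

With this shape, a crash between the two writes leaves a "STARTING" marker behind, which is exactly the signal needed to know that an attempt was made.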

Notifications

Incorporate Notifications subproject from alerting into Index Management

See future actions/steps that will be executed

During the community meeting, someone asked if it was possible to see which actions/steps will be executed in the future instead of having to manually check the policy to see what's next.

The only tricky part with this is once an index is in the transition phase and it has multiple states it can possibly go to, then we don't know where it will go. If it's in the beginning/middle of a state though, since it's linear we definitely know what's next.

Will use this issue to track it + see if other people are interested.
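
To illustrate the distinction between the linear part of a state and the open-ended transition part, here is a small Python sketch over the policy JSON shape used elsewhere in these issues. The function name `upcoming` is hypothetical, and it assumes each action object has exactly one key besides the generic `timeout` and `retry` options:

```python
def upcoming(policy, state_name, action_index):
    """Return (remaining_actions, possible_next_states) for a managed index.

    `action_index` is the position of the action currently being executed.
    """
    state = next(s for s in policy["states"] if s["name"] == state_name)
    # The action type is the key that is not a generic option.
    names = [
        next(k for k in action if k not in ("timeout", "retry"))
        for action in state["actions"]
    ]
    remaining = names[action_index + 1:]
    # Once the actions are done, any of these states may come next.
    next_states = [t["state_name"] for t in state["transitions"]]
    return remaining, next_states


example_policy = {
    "states": [
        {
            "name": "search",
            "actions": [
                {"timeout": "1h", "force_merge": {"max_num_segments": 1}},
                {"read_only": {}},
            ],
            "transitions": [{"state_name": "delete"}, {"state_name": "warm"}],
        }
    ]
}
print(upcoming(example_policy, "search", 0))
```

The remaining actions are known for certain; the list of next states is only a set of candidates when there is more than one transition.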

Implement Action Retry Logic

Speed up tests

Currently we have Thread.sleeps in our integ tests since we are waiting on asynchronous code in our execution. We usually just set a high number (5 seconds) to be sure it has enough time. We should add some helper util methods that assert on some calls and retry up to a time limit, so that if the assertion passes in less than 5 seconds we can move on.
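
The helper described above could look roughly like the following. It is written in Python purely as an illustration of the polling pattern (the plugin's tests are Kotlin), and `wait_for` with its parameters is a hypothetical name:

```python
import time


def wait_for(assertion, timeout=5.0, interval=0.1):
    """Re-run `assertion` until it stops raising AssertionError or time runs out.

    Returns whatever `assertion` returns on success; re-raises the last
    AssertionError once `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return assertion()
        except AssertionError:
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

A test would wrap its check in a function and call `wait_for(check)`, so it returns as soon as the asynchronous work has finished instead of always sleeping for the full 5 seconds.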

Question - applying policies to multiple indices using regex match

Hi there,

Is it possible to apply policies to multiple indices using a regex match? I see that policies can be applied to index name patterns using a wildcard, but I am unsure how to apply a policy using a more fine-grained index pattern match.

Eg -

Assume I wanted to update the policies for all 'index-pattern-YYYY.MM.DD' indices but not for 'index-pattern-*-YYYY.MM.DD' and ran the following -

curl -X POST _opendistro/_ism/change_policy/index-pattern-* -d @change-policy.json

----
change-policy.json
{
        "policy_id": "new-policy"
}

This would of course update all indices starting with index-pattern, including the ones I do not want to match.
Is there any filter that could be placed in the body of the JSON request to restrict the change_policy operation to indices matching a regex?

Cheers
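
Whether the API itself accepts a regex filter is not answered in this issue. A client-side workaround would be to list the index names, filter them with a regex, and issue the change_policy request per selected index. A minimal Python sketch of the filtering step only; `select_indices` and the sample names are hypothetical:

```python
import re


def select_indices(index_names, pattern):
    """Return the names that match `pattern` in full."""
    regex = re.compile(pattern)
    return [name for name in index_names if regex.fullmatch(name)]


names = [
    "index-pattern-2020.01.01",
    "index-pattern-foo-2020.01.01",
    "index-pattern-2020.01.02",
]
# Matches 'index-pattern-YYYY.MM.DD' but not 'index-pattern-*-YYYY.MM.DD'.
selected = select_indices(names, r"index-pattern-\d{4}\.\d{2}\.\d{2}")
for name in selected:
    print(f"POST _opendistro/_ism/change_policy/{name}")
```

Using `fullmatch` rather than `search` matters here, since a partial match would let the unwanted names through.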

Add "snapshot" to supported ISM operations

Support the ability to snapshot an index during a state change. This would be the same functionality found in the curator snapshot action found here: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/snapshot.html

Implement Stop ISM API

Support Elasticsearch 7.1

To be done at the end of development/before initial release.

Rollover API misalignment

Expected Behavior

When I provide ILM with a policy configured with the standard Elastic rollover API variables, I expect it to parse them properly, for example:

"actions": [
    {
        "rollover": {
            "max_docs": 1000,
            "max_age": "1h",
            "max_size": "40gb"
        }
    }
]

Current Behavior

Open Distro ILM expects different variables, such as min_size, min_index_age and min_doc_count. Not only are they inconsistent within ILM (min_size instead of min_index_size), they are also misaligned with the Elastic API (min vs max).

Possible Solution

Align the current Open Distro ILM variables with the Elastic ones.

Context

This issue arose while transferring our ILM policies from Elastic to Open Distro. Translating those variables to the Open Distro names would force us to write additional documentation on the rollover API instead of pointing to the official docs.
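
Until the names are aligned, the translation described above can be done mechanically. The mapping in this Python sketch is inferred only from the variable names quoted in this issue, and it assumes the value formats (such as "1h" or "40gb") are accepted unchanged on both sides; `to_ism_rollover` is a hypothetical helper:

```python
# Assumed correspondence between Elastic rollover conditions and the
# Open Distro names mentioned in this issue.
ELASTIC_TO_ISM = {
    "max_docs": "min_doc_count",
    "max_age": "min_index_age",
    "max_size": "min_size",
}


def to_ism_rollover(elastic_conditions):
    """Rename Elastic rollover condition keys to their Open Distro equivalents."""
    unknown = set(elastic_conditions) - set(ELASTIC_TO_ISM)
    if unknown:
        raise ValueError(f"unsupported rollover conditions: {sorted(unknown)}")
    return {ELASTIC_TO_ISM[key]: value for key, value in elastic_conditions.items()}


print(to_ism_rollover({"max_docs": 1000, "max_age": "1h", "max_size": "40gb"}))
```

Failing loudly on unknown keys keeps a migrated policy from silently losing a condition.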

Improve Add Policy API by using transport action instead of timeout

The initial implementation of the Add Policy API had a concerning case where two requests to add policy could overwrite each other. The concern was brought up here.

As a temporary fix, a timeout of 30 seconds was added to the Add Policy API action. As a more permanent fix, a transport action should perform the logic of iterating through the cluster state and updating the indices' settings on the master node itself, preventing requests to add policy both returning 200 status and one overwriting the other.

Show conditions in info object for transitions and rollover actions

Currently we only show information like:
"Attempting to transition"
"Attempting to rollover"
"Rolled over index"
"Transitioning to "

This leads to these actions behaving like a black box in terms of what information they are acting on.
The default rollover API has to be called and will return the results of the evaluation (true/false) for each condition specified.

I suggest we add something like the following to the "info" map for each step that is attempting to roll over or transition.

"info": {
    "message": "Attempting to rollover",
    "conditions": {
        "min_index_age": {
            "value": false,
            "condition": "7d",
            "current": "6.35d"
        },
        "min_size": {
            "value": false,
            "condition": "25gb",
            "current": "20.46gb"
        }
    }
}

Keep elastic API contracts regarding GET/PUT actions

Expected Behavior

GET _opendistro/_ism/policies returns list of created policies

Current Behavior

{
  "error": "Incorrect HTTP method for uri [/_opendistro/_ism/policies?pretty] and method [GET], allowed: [PUT]",
  "status": 405
}

Documentation improvement: implications of actions

Is your feature request related to a problem? Please describe.
I am currently designing the policies for an Elasticsearch cluster and see several actions whose implications I do not fully understand. For example, the close action closes the index (reasonable), but is a search still possible afterwards (the index is reopened on demand and closed again), or is it intended as an archive that has to be manually transitioned when needed?

Describe the solution you'd like
I would like https://opendistro.github.io/for-elasticsearch-docs/docs/ism/policies/#actions to give a clear description for each action:

what the action does
what it implies (e.g. on search the index is first automatically opened and loaded, leading to a delay of some seconds; or a search is not possible, and to be searchable again the index has to be manually reopened by the open action, e.g. via a manually triggered transition)
what the typical use cases for this action are

Describe alternatives you've considered
The alternative would be to run some in-depth tests or to try to find it out in the source code.

Additional context
This request targets only the documentation and does not include requests for code changes.

Previous action was not able to update IndexMetaData

Hello, I noticed that several indices have the status Failed with the error: "Previous action was not able to update IndexMetaData". I think it happens after the data nodes restart, but I am not sure.
Is there any way to configure an automatic retry for such an error?
My policy is below:
{ "policy": { "policy_id": "ingest_policy", "description": "Default policy", "last_updated_time": 1574686046552, "schema_version": 1, "error_notification": null, "default_state": "ingest", "states": [ { "name": "ingest", "actions": [], "transitions": [ { "state_name": "search", "conditions": { "min_index_age": "4d" } } ] }, { "name": "search", "actions": [ { "timeout": "2h", "retry": { "count": 5, "backoff": "constant", "delay": "1h" }, "force_merge": {"max_num_segments": 1 } } ], "transitions": [ { "state_name": "delete", "conditions": {"min_index_age": "30d"} } ] }, { "name": "delete", "actions": [ { "timeout": "2h", "retry": { "count": 5, "backoff": "constant", "delay": "1h" }, "delete": {} } ], "transitions": [] } ] } }

Elasticsearch doesn't start after installing ISM

I installed ISM and everything was OK. I then tried to upgrade the Elasticsearch version, went back to 7.0.1, and now I get this error from Elasticsearch:

Exception in thread "main" ElasticsearchException[java.io.IOException: failed to read /var/lib/elasticsearch/nodes/0/indices/B20JPahlQ9y6vD3i2VCLGg/_state/state-13.st]; nested: IOException[failed to read /var/lib/elasticsearch/nodes/0/indices/B20JPahlQ9y6vD3i2VCLGg/_state/state-13.st]; nested: IllegalStateException[Can't get text on a VALUE_NULL at -1:7812];
at org.elasticsearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:165)
at org.elasticsearch.gateway.MetaDataStateFormat.loadGeneration(MetaDataStateFormat.java:414)

Shrink number of shards action in Index Management

I would like to place a feature request to add a Shrink Index (reduce number of shards) action to the Index Management actions.

Just like Force merge, Read, Replica Count etc., it would be great to have a Shrink Index action in the actions section to reduce the number of shards. This would be a good feature to have for managed indices.

Currently it is really hard to manage the indices, because the number of shards is set when the index is created and only the number of replicas can be increased afterwards. I am currently using managed indices only for small indices. Larger indices are managed outside of Index Management by creating a new weekly index with fewer shards and reindexing all the daily indices of that week into the weekly index.

Thanks

Index not rolling on min_size

Hey,

We're experiencing an issue with our ILM configuration. Our current index has over 40GB of data

health status index       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test-000001 kc1l8oBFS8-LXnBrQNWIng   3   0   11415100            0      242gb          242gb

and our ILM policy is defined as

{
    "policy_id": "test",
    "description": "Policy for application: test",
    "last_updated_time": 1574861831914,
    "schema_version": 1,
    "error_notification": null,
    "default_state": "Hot",
    "states": [
        {
            "name": "Hot",
            "actions": [
                {
                    "rollover": {
                        "min_size": "40gb",
                        "min_index_age": "2d"
                    }
                }
            ],
            "transitions": [
                {
                    "state_name": "Warm",
                    "conditions": {
                        "min_index_age": "5d"
                    }
                }
            ]
        },
        {
            "name": "Warm",
            "actions": [
                {
                    "allocation": {
                        "require": {
                            "box_type": "warm"
                        },
                        "wait_for": false
                    }
                }
            ],
            "transitions": []
        }
    ]
}

It is able to roll over every two days but fails to roll over on increased size. Did we misconfigure something? This index is constantly in the "Attempting to rollover" state.

Implement ISM status API

Can't register repository in S3

I need to register my S3 bucket. I ran the attached code in Dev Tools in Kibana and it shows this result. Waht

Implement Force Merge action

Implement Snapshot action

Implement Replica Count action

Action timeout [BUG]

Describe the bug
I ran into an action timeout again.
Here is my policy:
{ "policy_id": "standard_policy", "description": "Default policy", "last_updated_time": 1584712848745, "schema_version": 1, "error_notification": null, "default_state": "rollover", "states": [{ "name": "rollover", "actions": [{ "timeout": "24h", "retry": { "count": 20, "backoff": "constant", "delay": "1h" }, "rollover": { "min_size": "25gb", "min_index_age": "7d" } }], "transitions": [{ "state_name": "search", "conditions": { "min_index_age": "7d" } }] }, { "name": "search", "actions": [], "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "37d" } }] }, { "name": "delete", "actions": [{ "timeout": "24h", "retry": { "count": 20, "backoff": "constant", "delay": "1h" }, "delete": {} }], "transitions": [] }] }
Despite the timeout configuration, I got "Action timed out" for several indices that use that policy, and it looks like the system didn't perform any retries.
Here is the explain output:
{
  "logging-time-000001" : {
    "index.opendistro.index_state_management.policy_id" : "standard_policy",
    "index" : "logging-time-000001",
    "index_uuid" : "bUtoGJeyReeVKv3z1YCooA",
    "policy_id" : "standard_policy",
    "policy_seq_no" : 25472,
    "policy_primary_term" : 93,
    "rolled_over" : false,
    "state" : { "name" : "rollover", "start_time" : 1584715154862 },
    "action" : { "name" : "rollover", "start_time" : 1584715454733, "index" : 0, "failed" : true, "consumed_retries" : 0, "last_retry_time" : 0 },
    "retry_info" : { "failed" : false, "consumed_retries" : 0 },
    "info" : { "message" : "Action timed out" }
  }
}

Desktop (please complete the following information):

Opendistro Version 1.4.0

Need to be able to collect execution metrics

opendistro-for-elasticsearch/job-scheduler#6

Add Open Distro Rollup Support

Requesting a rollup plugin in Open Distro that extends the capability currently offered by Elastic X-Pack Rollups. I suggest that the proposed approach use an Open Distro Job Scheduler extension as the mechanism for defining, scheduling and executing rollup actions, and that Open Distro SQL be used (potentially requiring modifications to that plugin) to generate the composite aggregations needed to produce usable rollups. I also request that the SQL aggregation query function support aggregations across one or more rollup indices in addition to one or more search-level indices.

Built-in TTL for index rule

As an enhancement, I would like to ask for a built-in TTL management tool for each index.
The purpose would be to make it easy to set how long data should be retained for each index.

Action timed out

Describe the bug
Hello
I have several indices with "Action timed out".
But when I ran explain, I found that consumed_retries = 0. Does that mean the action was tried only once, or does the consumed_retries parameter mean something different?
Here is the state of the index:

Here is output for _ism/explain:

{
  "logging-time-2019.10.28" : {
    "index.opendistro.index_state_management.policy_id" : "ingest_policy",
    "index" : "logging-time-2019.10.28",
    "index_uuid" : "Qy2JLOzJScG_xha1XowwFw",
    "policy_id" : "ingest_policy",
    "policy_seq_no" : 2613,
    "policy_primary_term" : 56,
    "transition_to" : "delete",
    "retry_info" : { "failed" : false, "consumed_retries" : 0 },
    "info" : { "message" : "Action timed out" }
  }
}

and here is the policy itself:

{
  "policy_id": "ingest_policy",
  "description": "Default policy",
  "last_updated_time": 1574536845562,
  "schema_version": 1,
  "error_notification": null,
  "default_state": "ingest",
  "states": [
    {
      "name": "ingest",
      "actions": [],
      "transitions": [{
        "state_name": "search",
        "conditions": { "min_index_age": "4d" }
      }]
    },
    {
      "name": "search",
      "actions": [{
        "timeout": "1h",
        "retry": { "count": 3, "backoff": "constant", "delay": "1h" },
        "force_merge": { "max_num_segments": 1 }
      }],
      "transitions": [{
        "state_name": "delete",
        "conditions": { "min_index_age": "60d" }
      }]
    },
    {
      "name": "delete",
      "actions": [{
        "timeout": "1h",
        "retry": { "count": 3, "backoff": "constant", "delay": "1h" },
        "delete": {}
      }],
      "transitions": []
    }
  ]
}

[RFC] Index Management

The purpose of this issue is to capture feedback and comments regarding the project's request for comments.

Automatic retry for failed status

Is it possible to configure some automatic retry mechanism for managed indices that are in the Failed state?

Multi-node integration tests

We currently have the Allocation PR blocked because there are no tests written for it.

Developers can't write those tests because we do not currently support multi-node integration tests.
We have seen issues in other plugins where integration tests pass on a single-node cluster and fail on a multi-node one, so ideally our integration tests should run against both setups.

index_management plugin 1.3.0.0 is not compatible with ODFE 1.2.x

It looks like the index_management plugin 1.3.0.0 is not compatible with ODFE 1.2.x.
I tried both ODFE 1.2.0 and 1.2.1 with the same result.
Does it make sense to try to rebuild the plugin, or is ODFE 1.3 going to be released soon?
Here is the error I got when I tried to build a Docker image:

Step 1/3 : FROM amazon/opendistro-for-elasticsearch:1.2.0
1.2.0: Pulling from amazon/opendistro-for-elasticsearch
...
Digest: sha256:5ebf61c123315627934a381a07eea6482cbff69883de821306da19a4f98eb057
Status: Downloaded newer image for amazon/opendistro-for-elasticsearch:1.2.0
 ---> 79f55ffa040a
Step 2/3 : COPY ./opendistro_index_management-1.3.0.0.zip /tmp
 ---> c268405a8707
Step 3/3 : RUN bin/elasticsearch-plugin install file:///tmp/opendistro_index_management-1.3.0.0.zip --batch
 ---> Running in 76718678c6f7
-> Downloading file:///tmp/opendistro_index_management-1.3.0.0.zip
Exception in thread "main" java.lang.IllegalArgumentException: Plugin [opendistro_index_management] was built for Elasticsearch version 7.3.2 but version 7.2.0 is running
     at org.elasticsearch.plugins.PluginsService.verifyCompatibility(PluginsService.java:346)

Build tasks

Create Gradle build tasks

[BUG] PR with no license headers didn't fail checks

Describe the bug
This PR had some files with missing license headers. It should fail an automatic check instead of relying on human verification.

Expected behavior
Running ./gradlew build with a source file that has no license header should fail.

Implement Webhook action

Implement recovery priority action

Expected Behavior
Our opendistro cluster has two data node types: hot and warm.

The index lifecycle tool from Elastic allows setting a recovery priority with “set_priority”: https://www.elastic.co/guide/en/elasticsearch/reference/master/_actions.html

Current Behavior
The priority action is not implemented.

Enable/Disable Index Management

We need to be able to disable Index Management in job scheduler
opendistro-for-elasticsearch/job-scheduler#3

Change policyCompleted in ManagedIndexMetaData to be non-nullable

As the null value for policyCompleted offers no real benefit, it is better to keep the boolean non-nullable for condition checking.

Create tests for isIdempotent

#165

Close Action - validation/constraints/stats

Currently when we do a Close action, we still allow you to define any actions after + transitions with conditions. We need to provide some validation framework to disallow certain policy configurations that do not make any sense (i.e. you cannot do this to a closed index, so don't even allow it). This can lead to a larger policy validation framework in general, but we have known issues with Close.

We also currently do IndexStats requests in transition which throws an error on closed index. So we need to a) not allow certain conditions (since doc count and size should never increase for a closed index) and b) figure out how to get index_age from a closed index.
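
A first cut of such a validation could be a pure function over the policy JSON. The Python sketch below is only an illustration of the two checks mentioned above: which actions remain acceptable after a close (`ALLOWED_AFTER_CLOSE`) is an assumption made for the example, all names are hypothetical, and it inspects only the state that contains the close action, not later states:

```python
ALLOWED_AFTER_CLOSE = {"open", "delete"}          # assumption for illustration
BLOCKED_CONDITIONS = {"min_doc_count", "min_size"}  # cannot grow on a closed index


def action_type(action):
    """The action type is the key that is not a generic option."""
    return next(k for k in action if k not in ("timeout", "retry"))


def close_violations(policy):
    """Return human-readable problems for states that close their index."""
    problems = []
    for state in policy["states"]:
        types = [action_type(a) for a in state["actions"]]
        if "close" not in types:
            continue
        for later in types[types.index("close") + 1:]:
            if later not in ALLOWED_AFTER_CLOSE:
                problems.append(f"{state['name']}: '{later}' follows 'close'")
        for transition in state["transitions"]:
            for condition in transition.get("conditions", {}):
                if condition in BLOCKED_CONDITIONS:
                    problems.append(
                        f"{state['name']}: condition '{condition}' on a closed index"
                    )
    return problems


bad_policy = {
    "states": [
        {
            "name": "cold",
            "actions": [{"close": {}}, {"force_merge": {"max_num_segments": 1}}],
            "transitions": [
                {"state_name": "delete", "conditions": {"min_size": "1gb"}}
            ],
        }
    ]
}
print(close_violations(bad_policy))
```

Returning a list of problems rather than raising on the first one lets a policy author see every invalid combination at once.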
