aptible/elasticsearch-logstash-s3-backup

An Aptible app that periodically archives older Logstash indexes from an Elasticsearch database. Indexes that follow the standard Logstash naming convention (logstash-YYYY.MM.DD) are archived daily based on their age, keeping a configurable number of the most recent indexes live. Each index is backed up over HTTPS to its own Elasticsearch snapshot in your S3 snapshot repository, where it is encrypted using AWS server-side encryption.
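As a rough illustration of age-based selection, the sketch below prints which index names fall before a cutoff date. It is not the repository's actual logic: the function name `indexes_to_archive` is hypothetical, GNU `date` is assumed, and `MAX_DAYS_TO_KEEP` is the optional setting described under Installation.

```shell
# Sketch only, assuming GNU date. Not the script this app actually runs.
MAX_DAYS_TO_KEEP="${MAX_DAYS_TO_KEEP:-30}"

# Usage: indexes_to_archive TODAY INDEX...
# TODAY is passed in explicitly (YYYY-MM-DD) so the example is reproducible.
indexes_to_archive() {
  local today="$1"
  shift
  local cutoff
  cutoff="logstash-$(date -u -d "$today $MAX_DAYS_TO_KEEP days ago" +%Y.%m.%d)"
  local index
  for index in "$@"; do
    # Zero-padded YYYY.MM.DD names sort chronologically as plain strings.
    if [ "$index" \< "$cutoff" ]; then
      echo "$index"
    fi
  done
}

indexes_to_archive 2015-07-10 logstash-2015.06.01 logstash-2015.06.10 logstash-2015.07.09
```

With the default of 30 days and a reference date of 2015-07-10, the cutoff is logstash-2015.06.10, so only logstash-2015.06.01 is printed.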

Installation

Recommended setup for running as an app on Aptible:

  1. Create an S3 bucket for your logs in the same region as your Elasticsearch database.

  2. Create an IAM user to run the backup and restore. Give your IAM user permission to read from and write to the bucket you created in step 1. The elasticsearch-cloud-aws plugin documentation has instructions on setting up these permissions; adding the following as an inline custom policy should be sufficient (replace MY_BUCKET_NAME with the name of your S3 bucket):

    {
        "Statement": [
            {
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListBucketMultipartUploads",
                    "s3:ListBucketVersions"
                ],
                "Effect": "Allow",
                "Resource": [
                    "arn:aws:s3:::MY_BUCKET_NAME"
                ]
            },
            {
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts"
                ],
                "Effect": "Allow",
                "Resource": [
                    "arn:aws:s3:::MY_BUCKET_NAME/*"
                ]
            }
        ],
        "Version": "2012-10-17"
    }
    
    
  3. Create an app in your Aptible account for the cron. You can do this through the Aptible dashboard or using the Aptible CLI:

    aptible apps:create YOUR_APP_HANDLE
    

    In the steps that follow, replace YOUR_APP_HANDLE (with or without angle brackets) with the app handle you chose in this step.

  4. Set the following environment variables in your app's configuration:

    • DATABASE_URL: Your Elasticsearch URL.
    • S3_BUCKET: Your S3 bucket name.
    • S3_BUCKET_BASE_PATH: Destination path within the bucket (optional).
    • S3_ACCESS_KEY_ID: The access key you generated in step 2.
    • S3_SECRET_ACCESS_KEY: The secret key you generated in step 2.

    You may also wish to override any of the following optional environment variables:

    • MAX_DAYS_TO_KEEP: The number of days of live logstash indexes you'd like to keep in your Elasticsearch instance. Any indexes from before this point will be archived by the cron. Defaults to 30.
    • CRON_SCHEDULE: The schedule for your backups. Defaults to "0 2 * * *", which runs nightly at 2 A.M. Make sure to escape any asterisks when setting this variable from the command line to avoid shell expansion.
    • REPOSITORY_NAME: The name of your Elasticsearch snapshot repo. This is the handle you can use in Elasticsearch to perform operations on your snapshot repo. Defaults to "logstash-snapshots".
    • WAIT_SECONDS: Number of seconds to wait on an index archive to succeed after the request has been made. Defaults to 1800.

    Environment variables can be set using the Aptible CLI, for example:

    aptible config:set --app YOUR_APP_HANDLE NAME=VALUE OTHER_NAME=OTHER_VALUE
    
  5. Clone this repository and push it to your Aptible app:

    git clone https://github.com/aptible/elasticsearch-logstash-s3-backup.git
    cd elasticsearch-logstash-s3-backup
    git remote add aptible git@beta.aptible.com:<YOUR_APP_HANDLE>.git
    git push aptible master
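
Step 4 mentions escaping the asterisks in CRON_SCHEDULE. One way to do that, sketched here with a hypothetical 3 A.M. schedule, is to single-quote the value so the shell passes it through literally. The `aptible config:set` call is left commented out so the snippet runs without the CLI installed.

```shell
# Single quotes stop the shell from expanding each * into file names.
CRON_SCHEDULE='0 3 * * *'   # hypothetical schedule: nightly at 3 A.M.

# Double-quote the variable when using it, for the same reason.
echo "CRON_SCHEDULE=$CRON_SCHEDULE"

# aptible config:set --app YOUR_APP_HANDLE "CRON_SCHEDULE=$CRON_SCHEDULE"
```

Backslash-escaping each asterisk works as well; quoting is simply harder to get wrong when the value contains several of them.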
    

Notes

The cron job run by this app executes on the schedule set in CRON_SCHEDULE (daily by default) and logs its progress to stdout.

To test the backup or run it manually, use the Aptible CLI to run the backup-all-indexes.sh script in a container over SSH:

    aptible ssh --app YOUR_APP_HANDLE ./backup-all-indexes.sh
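
WAIT_SECONDS caps how long a backup waits for each index archive to succeed. A bounded wait of that kind might look like the sketch below. This is an illustration under assumptions, not the code in backup-all-indexes.sh: `wait_until` and `POLL_INTERVAL` are hypothetical names, and the check command is whatever you pass in.

```shell
# Sketch only: poll a check command until it succeeds or WAIT_SECONDS elapse.
WAIT_SECONDS="${WAIT_SECONDS:-1800}"
POLL_INTERVAL="${POLL_INTERVAL:-5}"   # hypothetical: seconds between checks

# Usage: wait_until COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once the deadline has passed.
wait_until() {
  local deadline
  deadline=$(( $(date +%s) + WAIT_SECONDS ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$POLL_INTERVAL"
  done
}
```

A caller would pass its own check, for example `wait_until snapshot_succeeded logstash-2015.06.01`, where `snapshot_succeeded` is a hypothetical function that asks Elasticsearch whether the snapshot has completed.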

To restore an index, you can use the restore-index.sh script included in this repository with the Aptible CLI. For example, to load the index for July 10, 2015, run:

    aptible ssh --app YOUR_APP_HANDLE ./restore-index.sh logstash-2015.07.10

This will load the index back into your Elasticsearch instance. Note that if you load an archived index into the same instance that this cron runs against, the index will be removed again at the end of the day. Alternatively, you can load the index into a different Elasticsearch instance by overriding the DATABASE_URL environment variable in an SSH session before you run the restore script:

    $ aptible ssh --app YOUR_APP_HANDLE bash
    bash-4.3# DATABASE_URL=https://some-other-elasticsearch ./restore-index.sh logstash-2015.07.09
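
The restore script takes an index name, and index names follow the logstash-YYYY.MM.DD convention, with dots between the date parts. If you start from a calendar date, a small helper can build the name for you. This is a sketch assuming GNU `date`; `index_name_for` is a hypothetical name, not something shipped in this repository.

```shell
# Sketch only, assuming GNU date: turn a YYYY-MM-DD date into an index name.
index_name_for() {
  date -u -d "$1" +logstash-%Y.%m.%d
}

index_name_for 2015-07-10   # prints logstash-2015.07.10

# Run locally, the helper can feed the restore command:
# aptible ssh --app YOUR_APP_HANDLE ./restore-index.sh "$(index_name_for 2015-07-10)"
```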

Copyright and License

MIT License, see LICENSE for details.

Copyright (c) 2019 Aptible and contributors.
