ansible-samvera's Introduction

ansible-samvera

Configuration Management Scripts for Samvera based servers

Apache 2.0 License

This repository provides predefined ansible roles to install and configure a typical set of dependencies required to run a Hyrax-based repository, including:

  • Solr
  • Fedora
  • PostgreSQL
  • Apache webserver
  • Ruby
  • FITS
  • FFmpeg and its dependencies
  • ImageMagick and its dependencies

Operating systems

Major version numbers of releases in this repository target the corresponding Ubuntu release. For example, v20.1.1 is compatible with the 20.x LTS Ubuntu server release.
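
A playbook could guard against applying a release to the wrong Ubuntu series with a check like the one below. This is a minimal sketch, not something the repository ships: the expected_ubuntu_major variable is hypothetical, and the check relies on fact gathering being enabled.

```yaml
# Sketch only: expected_ubuntu_major is a hypothetical variable, not one defined by these roles.
- name: verify the target runs the Ubuntu series this release was built for
  ansible.builtin.assert:
    that:
      - ansible_distribution == 'Ubuntu'
      - ansible_distribution_major_version == expected_ubuntu_major
    fail_msg: "Expected Ubuntu {{ expected_ubuntu_major }}.x, found {{ ansible_distribution }} {{ ansible_distribution_version }}"
  vars:
    expected_ubuntu_major: '20'
```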

Prerequisites

Tested with Ansible 2.11.4. Please see the official Installing Ansible documentation for details on installing or upgrading Ansible.

Contributing

Contributions are welcome in the form of issues (including bug reports, use cases) and pull requests.

Origins

This Ansible project was preceded by ansible-hydra, a project created by Data Curation Experts for the Chemical Heritage Foundation.

Usage

Here is an example playbook that uses these roles (plus a few extra internal-to-DCE roles) to build a single box running all the Samvera components:

- hosts: '{{ host }}'
  name: ensure python is installed for ansible
  user: ubuntu
  gather_facts: false
  pre_tasks:
      - name: update apt-get
        raw: sudo apt-get update
      - name: install python for ansible
        raw: sudo apt-get -y install python-simplejson

- hosts: '{{ host }}'
  name: configure server
  user: ubuntu
  gather_facts: true
  vars:
    keys_to_add:
      - https://github.com/user1.keys
      - https://github.com/user2.keys
  roles:
    - { role: packages }
    - { role: set_timezone, timezone: America/Chicago }
    - { role: set_hostname, hostname: stage-demo }
    - { role: sshd_config }
    - { role: setup_logrotation }
    - { role: clamav }
    - { role: nrpe, nrpe_version: '3.1.1', nagios_plugins_version: '2.2.1' }
    - { role: ruby, ruby_version: '2.4.2', ruby_sha_256: '93b9e75e00b262bc4def6b26b7ae8717efc252c47154abb7392e54357e6c8c9c' }
    - { role: postgres }
    - { role: fedora }
    - { role: solr, solr_version: '6.6.2' }
    - { role: pip }
    - { role: fits, fits_version: '0.8.4' }
    - { role: apache, passenger_ver: '5.1.11' }
    - { role: apache_with_mod_ssl }
    - { role: capistrano_setup }
    - { role: dotenv, hostname: stage-demo }
    - { role: sidekiq }
    - { role: imagemagick, imagemagick_ver: '7.0.7', gs_ver: '9.19', openjpg_ver: '2.1.0', libtiff_ver: '4.0.5', libpng_ver: '1.6.28' }
    - { role: ffmpeg, ffmpeg_version: '3.4' }
    - { role: solr-schema }
    - { role: first_deploy }
    - { role: dce_ssl }
    - { role: restart }
    - { role: splunkuforwarder }

An observation for minimal test installs: these roles install successfully on an AWS Ubuntu server with 4 GB of RAM (t3a.medium), but not on one with 1 GB of RAM (a free-tier "micro" server).

ansible-samvera's People

Contributors

bess, fnibbit, ksclarke, little9, mark-dce, maxkadel, mlooney, oraposa, solomonhd

ansible-samvera's Issues

Add cron job to remove old files from /tmp

Run this cron job daily to remove any files owned by the deploy user that are older than 7 days. Otherwise /tmp fills up with imagemagick temp files.

sudo find /tmp -type f -mtime +7 -user deploy -execdir rm -- '{}' \;
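
The cleanup above could be installed as an Ansible task along these lines. This is a sketch under assumptions: the task and cron entry names are made up, and the job is placed in root's crontab so the sudo prefix is not needed.

```yaml
# Sketch only: the entry name and the choice of root's crontab are assumptions;
# the find command is the one quoted in this issue.
- name: remove deploy-owned files older than 7 days from /tmp
  ansible.builtin.cron:
    name: clean old deploy files from /tmp
    user: root
    special_time: daily
    job: "find /tmp -type f -mtime +7 -user deploy -execdir rm -- '{}' \\;"
  become: true
```

The doubled backslash is YAML escaping inside a double-quoted string; the crontab entry ends in the usual `\;`.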

Add Default SSH session timeout

ISSUE
For security purposes our servers should drop ssh sessions after a specified amount of inactivity. We want to be resilient on poor network connections though.

PROPOSED SETTINGS
ClientAliveInterval 120
ClientAliveCountMax 5

In the proposed settings, the client would have to fail at five consecutive 2-minute checks (i.e. 10 minutes), but would remain alive if the system provides a successful response in any one of the tries. In the original, the session dies immediately on a single check at 10 minutes with no retries.

This needs to go in /etc/ssh/sshd_config
See sample at https://github.com/curationexperts/ansible-hydra/blob/master/roles/system_setup/templates/ssh_config
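
One way to apply the proposed settings from a role is sketched below. The handler name "restart ssh" is hypothetical and would need to be defined in the role; the validate step asks sshd to check the edited file before it replaces the live one.

```yaml
# Sketch only: the "restart ssh" handler is hypothetical and must exist in the role.
- name: set ssh keepalive interval and retry count
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?\s*{{ item.key }}\s'
    line: "{{ item.key }} {{ item.value }}"
    validate: /usr/sbin/sshd -t -f %s
  loop:
    - { key: ClientAliveInterval, value: 120 }
    - { key: ClientAliveCountMax, value: 5 }
  become: true
  notify: restart ssh
```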

Bundler install failures

I'm seeing this in Tufts and then trying to run from-cm as it's currently checked in. I've tried against both Ubuntu 16.04 and 16.10 and get the same error.

Testing from the command line I get the following:

fatal: [54.197.209.70]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/gem install --no-user-install --no-document bundler", "failed": true, "msg": "ERROR:  Error installing bundler:\n\t\"bundle\" from bundler conflicts with /usr/local/bin/bundle", "rc": 1, "stderr": "ERROR:  Error installing bundler:\n\t\"bundle\" from bundler conflicts with /usr/local/bin/bundle\n", "stderr_lines": ["ERROR:  Error installing bundler:", "\t\"bundle\" from bundler conflicts with /usr/local/bin/bundle"], "stdout": "", "stdout_lines": []}
	to retry, use: --limit @/Users/mark/Documents/workspace/_no_backup_/deploying_hydra/frbm-cm/build_cypripedium_server.retry

PLAY RECAP ***************************************************************************************************
54.197.209.70              : ok=27   changed=20   unreachable=0    failed=1   

MARKs-MacBook-Pro-2:frbm-cm mark$ ssh [email protected]
Welcome to Ubuntu 16.10 (GNU/Linux 4.8.0-56-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

0 packages can be updated.
0 updates are security updates.

Your Ubuntu release is not supported anymore.
For upgrade information, please visit:
http://www.ubuntu.com/releaseendoflife

New release '17.04' available.
Run 'do-release-upgrade' to upgrade to it.


*** System restart required ***
Last login: Fri Nov  3 00:32:40 2017 from 67.220.22.48
ubuntu@demo:~$ sudo gem install bundler
bundler's executable "bundle" conflicts with /usr/local/bin/bundle
Overwrite the executable? [yN]  n
ERROR:  Error installing bundler:
	"bundle" from bundler conflicts with /usr/local/bin/bundle
ubuntu@demo:~$ bundle --version
Bundler version 1.16.0
ubuntu@demo:~$ gem install bundler
Fetching: bundler-1.16.0.gem (100%)
ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions for the /usr/local/lib/ruby/gems/2.4.0 directory.
ubuntu@demo:~$ 

Create .pgpass file for ubuntu user

One of the most time-consuming maintenance tasks on ansible-samvera built systems is looking up the postgres password. Instead, let's record it in a secure way and make the ubuntu user not have to type it.

  • Follow the guide at https://wiki.postgresql.org/wiki/Pgpass
  • The postgres password should be recorded in a known place on the system
  • The ubuntu user shouldn't need to type the password in order to connect to postgres
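
A task along the following lines could write the file. It is a sketch, not the project's implementation: postgres_user and postgres_password are hypothetical variable names, and the host and port are assumed to be the local defaults. Each line of a .pgpass file has the form hostname:port:database:username:password, and the file must not be readable by group or others.

```yaml
# Sketch only: postgres_user and postgres_password are hypothetical variable names;
# the password should come from an encrypted source such as Ansible Vault.
- name: create .pgpass for the ubuntu user
  ansible.builtin.copy:
    dest: /home/ubuntu/.pgpass
    content: "localhost:5432:*:{{ postgres_user }}:{{ postgres_password }}\n"
    owner: ubuntu
    group: ubuntu
    mode: '0600'
  no_log: true
```

A password containing a colon or backslash would need those characters escaped with a backslash before being written.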

Cache fits install package somewhere

Today the fits download site went offline. I hope it comes back soon, and in the meantime I've been able to copy the .zip file from previously installed machines, but we should cache important installers somewhere for these kinds of outages.
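
Assuming the installers were copied to a mirror we control, the fits role could fetch from there instead of the upstream site. In this sketch installer_cache_url is a hypothetical variable and the archive file name is a guess; fits_version and install_path are the variables already used elsewhere in these roles.

```yaml
# Sketch only: installer_cache_url and the archive file name are assumptions.
- name: download the FITS installer from a cache we control
  ansible.builtin.get_url:
    url: "{{ installer_cache_url }}/fits-{{ fits_version }}.zip"
    dest: "{{ install_path }}/fits-{{ fits_version }}.zip"
    mode: '0644'
```

Because dest is a file path, get_url skips the download when the file is already present, so repeated runs do not depend on the mirror.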

Have each role treat the install directory consistently

See ffmpeg vs. fits. Fits will inherit project-wide {{ install_path }}. ffmpeg assumes a ~/install exists. Pick one pattern or the other and make sure all roles ensure that their expected install directory exists at the beginning of the role.
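
If the project-wide variable is the pattern chosen, each role could start with a task like this one. It is a sketch; the mode is an assumption.

```yaml
# Sketch only: intended as the first task in each role's tasks/main.yml.
- name: ensure the install directory exists
  ansible.builtin.file:
    path: "{{ install_path }}"
    state: directory
    mode: '0755'
```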

Nailgun for imagemagick?

Talk to Jamie about whether we could use the nailgun pattern for imagemagick and ffmpeg as well as fits

Things missing from a different source Ubuntu image

I've created a Packer build using these roles and it creates a VM output. In the process, I found there were slight differences between the VM source image I used and the source Ubuntu image/AMI you all must be using. I'm debating submitting PRs for these but for the moment have just handled them via a shell script in Packer.

I thought I'd document them here in case others use these roles and run into them.

Missing: nokogiri and nodejs (or any JavaScript runtime, but NodeJS was the one I chose to install)

The environment also needed to be set to UTF-8 so that the Postgres databases are created with that encoding rather than LATIN1: update-locale LANG="en_US.UTF-8" LANGUAGE="en_US". (Having the databases created as LATIN1 caused the Fedora environment to fail to start, though this wasn't noticed in the Fedora role but in the last step of the first_deploy role, where the admin set is created.)

Fwiw, I also had to set the perms on /var/www to deploy:deploy (though for some reason you all aren't needing to do this, I guess). But this https://stackoverflow.com/questions/24470520/capistrano-mkdir-permission-denied solved a perm problem for me.

If you think it's worth splitting some of these into smaller tickets and getting rid of this all-in-one ticket, that's fine with me.
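
Two of the workarounds above, expressed as Ansible tasks rather than a Packer shell script. This is a sketch: it assumes the tasks run before the postgres role and that the deploy user already exists.

```yaml
# Sketch only: assumes these run before the postgres role and after the deploy user is created.
- name: set the system locale to UTF-8
  ansible.builtin.command: update-locale LANG=en_US.UTF-8 LANGUAGE=en_US
  become: true

- name: make the deploy user the owner of /var/www
  ansible.builtin.file:
    path: /var/www
    state: directory
    owner: deploy
    group: deploy
  become: true
```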

Make sure we aren't storing anything in the tmp directory of the app in production

Set up appropriate configurations for all content typically stored in $RAILS_ROOT/tmp in development environments.

https://github.com/curationexperts/epigaea/blob/80b78ded44ee21ee3ead3c7b19497f99f37c6fb6/config/deploy.rb#L37-L49
append :linked_dirs, "public/assets"
append :linked_dirs, "tmp/pids"
append :linked_dirs, "tmp/cache"
append :linked_dirs, "tmp/sockets"
append :linked_dirs, "log"

Link the drafts dir specified in config/environments/production.rb config.drafts_storage_dir

append :linked_dirs, "tmp/drafts"

Link the exports dir specified in config/environments/production.rb config.exports_storage_dir

append :linked_dirs, "tmp/exports"

Link the templates dir specified in config/environments/production.rb config.templates_storage_dir

append :linked_dirs, "tmp/templates"

Make ssh timeout configurable

We've hardcoded the ssh timeout into our config and may have cases (like building VMs or training EC2 instances for Camps) where we either want longer or no timeouts.

Possible options would be to make the timeouts settings configurable variables, or to pull out the timeout setting and make it a separate role:

  roles:
    ...
    - { role: sshd_config, ClientAliveInterval: 0 }
    ...

OR

  roles:
    ...
    - { role: sshd_config }
    - { role: sshd_timeout }
    ...
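
The first option could be sketched with role defaults that a playbook overrides. The variable names here are hypothetical, and the role's tasks or template would need to read them. Setting ClientAliveInterval to 0 tells sshd not to send keepalive probes at all.

```yaml
# roles/sshd_config/defaults/main.yml
# Sketch only: these variable names are hypothetical.
sshd_client_alive_interval: 120
sshd_client_alive_count_max: 5
---
# Playbook that turns the keepalive checks off for a VM or training build
- hosts: '{{ host }}'
  roles:
    - { role: sshd_config, sshd_client_alive_interval: 0 }
```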

Don't hardcode the `deploy` user

Compare this repo's usage https://github.com/curationexperts/ansible-samvera/search?utf8=✓&q=deploy&type=
to ansible-hydra usage https://github.com/curationexperts/ansible-hydra/search?utf8=✓&q=capistrano_user&type=

If for any reason we use a different user name to run the rails application - like when building a Vagrant VM for Samvera Camp, you can't use any of the scripts that assume there's a user named deploy. For my specific case, I was going to try to leverage the sidekiq role to get sidekiq running as a system service, but the role is hardcoded to use the deploy user which doesn't exist in the VM environment. Ideally I could use the role, but with the vagrant user.
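
A sketch of how this might look, borrowing the capistrano_user name from ansible-hydra. It assumes the sidekiq role's tasks and templates are changed to reference the variable wherever they currently say deploy.

```yaml
# roles/sidekiq/defaults/main.yml
# Sketch only: capistrano_user mirrors the ansible-hydra variable; the default keeps current behavior.
capistrano_user: deploy
---
# Playbook for a Vagrant build that runs the application as the vagrant user
- hosts: '{{ host }}'
  roles:
    - { role: sidekiq, capistrano_user: vagrant }
```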

Shut-down ansible-hydra

ACCEPTANCE

  • Notify potential users of deprecation
    • Schedule call with D. Sanford
  • Merge into ansible-samvera as deprecated release (or find other means to support builds based on ansible-hydra until all DCE systems rebuilt)
  • Delete ansible-hydra repo

Add server restart role

It looks like after some packages are installed via apt-get update and apt-get upgrade, the server needs a restart. Rather than adding this to the packages role, however, let's make it a separate role that can be placed at the end of a playbook. That ensures it captures any other changes made during the configuration and proves that the web server, solr, and fedora restart correctly after a reboot.
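
The core of such a role could be a single task. This is a sketch; the timeout is an assumed value, and any checks that the web server, solr, and fedora came back up would follow it.

```yaml
# Sketch only: the timeout value is an assumption.
- name: reboot the server and wait for it to come back
  ansible.builtin.reboot:
    reboot_timeout: 600
  become: true
```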

Configure passenger to allow encoded slashes

Hyrax's default admin set contains a slash ("admin_set/default"), which leads to a bug where it can't be displayed unless passenger is configured to allow encoded slashes. For more context, see:

The fix is to add this line to conf-enabled/passenger.conf:

  PassengerAllowEncodedSlashes on

  • Add this to ansible so it's set by default for future builds
  • Check existing projects (e.g., demo.curationexperts.com, FRBM) and ensure it's added there
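
For the first item, a task like the following could add the directive. It is a sketch: /etc/apache2 is assumed as the Apache directory, and the "restart apache" handler name is hypothetical.

```yaml
# Sketch only: the /etc/apache2 prefix and the "restart apache" handler are assumptions.
- name: allow encoded slashes in passenger
  ansible.builtin.lineinfile:
    path: /etc/apache2/conf-enabled/passenger.conf
    regexp: '^\s*PassengerAllowEncodedSlashes\s'
    line: PassengerAllowEncodedSlashes on
  become: true
  notify: restart apache
```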

Cap hyrax:ensure_default_admin_set doesn't exist in all codebases

I'm getting

TASK [first_deploy : ensure default admin set with capistrano] ********************************************************************************
fatal: [52.206.81.96]: FAILED! => {"changed": true, "cmd": "cap localhost hyrax:ensure_default_admin_set", "delta": "0:00:00.287908", "end": "2017-11-03 20:28:26.710093", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-11-03 20:28:26.422185", "stderr": "(Backtrace restricted to imported tasks)\ncap aborted!\nDon't know how to build task 'hyrax:ensure_default_admin_set' (see --tasks)\n\n(See full trace by running task with --trace)", "stderr_lines": ["(Backtrace restricted to imported tasks)", "cap aborted!", "Don't know how to build task 'hyrax:ensure_default_admin_set' (see --tasks)", "", "(See full trace by running task with --trace)"], "stdout": "", "stdout_lines": []}
	to retry, use: --limit @/Users/mark/Documents/workspace/_no_backup_/deploying_hydra/tufts-cm/tufts-ubuntu.retry

Because of https://github.com/curationexperts/ansible-samvera/blob/v1.1.0/roles/first_deploy/tasks/main.yml#L80-L83

- name: ensure default admin set with capistrano
  shell: cap localhost hyrax:ensure_default_admin_set
  args:
    chdir: /home/{{ ansible_ssh_user }}/{{ project_name }}

But the hyrax:ensure_default_admin_set Capistrano task isn't defined in all of our projects. Could we just call the rake task directly?
execute :rake, 'hyrax:default_admin_set:create'
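
On the Ansible side, one alternative would be to skip Capistrano for this step and run the rake task in the deployed application directory. This is only a sketch: app_current_path is a hypothetical variable pointing at the deployed release, and the production environment is assumed.

```yaml
# Sketch only: app_current_path is a hypothetical variable; RAILS_ENV=production is an assumption.
- name: ensure default admin set with rake
  ansible.builtin.command:
    cmd: bundle exec rake hyrax:default_admin_set:create
    chdir: "{{ app_current_path }}"
  environment:
    RAILS_ENV: production
```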
