
atmosphere-ansible's Introduction

THIS REPOSITORY IS READONLY AND NO LONGER ACTIVE

Atmosphere's Ansible Instance Deployment Setup

This is an extension point for Atmosphere. It defines (with Ansible) how Atmosphere's instances deploy within the targeted cloud provider.

To see how Atmosphere uses this module, look at service.deploy.py and our subspace module.

Extension Point?

We use this term to denote that an installation of Atmosphere can customize or modify the actions performed when deploying virtual machine instances to a cloud provider by altering the playbooks defined here.

Execution Order

The number prefixing the name of a playbook (example: 20_atmo_user_install.yml) is used by subspace to determine the order of execution, so 20_atmo_user_install.yml runs after 18_atmo_local_user_account.yml. Because the prefix is treated as a number, a playbook named 100_atmo_user_ssh_keys.yml would run last even though it appears second in a directory listing of playbooks. To avoid this mismatch, keep playbook prefixes below 100. Here is a list of currently executed playbooks:

ansible/playbooks/instance_deploy/
├── 00_setup_ssh.yml
├── 10_setup_pkg_mgr.yml
├── 18_atmo_local_user_account.yml
├── 20_atmo_user_install.yml
├── 30_post_user_install.yml
├── 41_shell_access.yml
└── 42_globus_connect.yml
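
The difference between listing order and execution order can be illustrated with a minimal Python sketch. It assumes the prefix is parsed as an integer and sorted numerically; it is not taken from subspace itself, and `100_example.yml` is a hypothetical file name used only for the demonstration.

```python
import re


def prefix_of(name):
    """Return the integer prefix of a playbook name such as '20_atmo_user_install.yml'."""
    match = re.match(r"(\d+)_", name)
    if match is None:
        raise ValueError(f"no numeric prefix in {name!r}")
    return int(match.group(1))


def execution_order(names):
    """Order playbooks by numeric prefix (assumed behaviour), not by string comparison."""
    return sorted(names, key=prefix_of)


playbooks = [
    "00_setup_ssh.yml",
    "10_setup_pkg_mgr.yml",
    "18_atmo_local_user_account.yml",
    "20_atmo_user_install.yml",
    "100_example.yml",  # hypothetical name, not part of the repository
]

listing_order = sorted(playbooks)           # plain string sort, as in a directory listing
numeric_order = execution_order(playbooks)  # numeric sort on the prefix
```

In `listing_order` the hypothetical `100_example.yml` sits in second place, while in `numeric_order` it comes last, which is why prefixes of 100 or more are best avoided.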

After instance_deploy, these playbooks are run to add the user's SSH keys and run their boot scripts.

ansible/playbooks/user_deploy/
├── 00_inject_ssh_keys.yml
└── 10_post_boot.yml

Configuring atmosphere-ansible

Atmosphere-ansible provides optional functionality that is used in some, but not all, deployments.

Enable these optional configurations by setting the corresponding variable to true (e.g. in your Ansible group_vars).

Variable                  Purpose
PACKAGE_NO_UPDATES        Disables package cache updates and latest installs; only checks that packages are present
PACKAGE_RETRIES           Number of additional attempts to retry installing packages (total attempts is this number plus 1; default is 6)
PACKAGE_DELAY             Number of seconds between each of the PACKAGE_RETRIES attempts (default is 20)
SETUP_DHCP_CLIENT         DHCP client
SETUP_LDAP                LDAP client for user authentication
SETUP_LOCAL_USER_ACCOUNT  Local user account (always runs when SETUP_LDAP is not true)
SETUP_IRODS_ICOMMANDS     iRODS iCommands and iRODS FUSE client
SETUP_REALVNC_SERVER      RealVNC server for Atmosphere Web Desktop feature
SETUP_GLOBUS_CONNECT      Globus Connect
SETUP_GUACAMOLE           Apache Guacamole
SETUP_NOVNC               NoVNC VNC Client
SETUP_GUI_BROWSER         Web browser on instances with a GUI
SET_DESKTOP_BACKGROUND    Set desktop background for instances with a GUI
SETUP_ATMO_BACKUP         Deploy cyverse_backup script
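
To make the relationship between PACKAGE_RETRIES and PACKAGE_DELAY concrete, here is a minimal Python sketch of a retry loop with the documented defaults. It only illustrates the arithmetic (retries plus one total attempts, with a delay between attempts); the function name `install_with_retries` and its `install` callback are hypothetical and are not how the roles implement retries.

```python
import time

PACKAGE_RETRIES = 6  # documented default: additional attempts after the first
PACKAGE_DELAY = 20   # documented default: seconds between attempts


def install_with_retries(install, retries=PACKAGE_RETRIES, delay=PACKAGE_DELAY,
                         sleep=time.sleep):
    """Call install() until it returns True, at most retries + 1 times.

    Returns the number of attempts made; raises RuntimeError if all fail.
    """
    total_attempts = retries + 1
    for attempt in range(1, total_attempts + 1):
        if install():
            return attempt
        if attempt < total_attempts:
            sleep(delay)  # wait between attempts, not after the last one
    raise RuntimeError(f"install failed after {total_attempts} attempts")
```

With the defaults, a package that never installs is attempted 7 times with 6 waits of 20 seconds each, so the worst case adds about two minutes of waiting per failing install step.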

Guacamole Information

Guacamole is a VNC and SSH gateway. If using this feature, GUACAMOLE_SERVER_IP must also be defined.

Security Warning: the Guacamole remote desktop requires unencrypted VNC connections from the Guacamole server to your target instances. Ensure that your Guacamole server connects to instances via a trusted network where no unauthorized parties can listen to network traffic. If these connections transit an untrusted network, anyone listening on the wire would get everything from the unencrypted VNC sessions.

Additional Playbooks

Utility Playbooks

These playbooks are called separately via subspace to verify things such as network connectivity and VNC status for Atmosphere deployments.

ansible/playbooks/utils/
├── atmo_check_novnc.yml
├── atmo_check_vnc.yml
├── check_networking.yml
├── play_role.yml
└── print_variables.yml

Instance Action Playbooks

This directory contains playbooks for performing additional actions on an instance post-deploy, such as mounting volumes.

ansible/playbooks/instance_actions/
├── check_volume.yml
├── mount_volume.yml
└── unmount_volume.yml

User Customization Playbooks

This directory will contain playbooks that a user can select and install. They may contain additional metadata that helps users understand the playbook's intention. Additionally, they may take different arguments depending on the purpose of the playbook.

ansible/playbooks/user_customizations/
├── README.md
├── add_user.yml
└── remove_user.yml

Imaging Playbooks

This directory contains a playbook, prepare_instance_snapshot.yml, that syncs and freezes an instance to prepare it for the imaging process.

Troubleshooting Instances

See this troubleshooting page: Troubleshooting Atmosphere Ansible

Contributing to atmosphere-ansible

Generally, new roles should be created using ansible-role-template using Ansible Galaxy. Optionally, test roles using Travis CI as detailed here.

License

See LICENSE file.


atmosphere-ansible's Issues

Sync hwclock with system time, and stop NTP before and after

Some systems have their time set in the future, which causes all sorts of issues. It appears that some suspended instances do not correct themselves. We are reading from hwclock to bring time closer, which appeared to work, but it may instead be necessary to get time from outside, say with curl, and then set the hwclock using hwclock --systohc. A fix for this was made with issue #30, but without seeing output from ELK it is difficult to determine if the curl is working.

One can read the current time like this: hwclock -r

Steps to fix this:

  1. Stop ntpd, and be sure to handle errors if ntpd is not already installed
  2. Get time from an outside source, like this: date -s "$(curl -s --head http://google.com | grep ^Date: | sed 's/Date: //g')"
  3. Sync current time back to HW using system time like this: hwclock --systohc
  4. Install NTP
  5. Start service
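
The header parsing in step 2 can be sketched in Python to check that the extracted value is a usable timestamp. This is an illustration only: `date_from_headers` is a hypothetical helper, the sample headers are made up, and the sketch parses the time without setting the system clock.

```python
from email.utils import parsedate_to_datetime


def date_from_headers(header_text):
    """Return the Date header of an HTTP response as a timezone-aware datetime.

    Mirrors the `grep ^Date: | sed 's/Date: //g'` part of step 2.
    """
    for line in header_text.splitlines():
        if line.startswith("Date:"):
            return parsedate_to_datetime(line[len("Date:"):].strip())
    raise ValueError("no Date header found")


# Made-up response headers for illustration.
sample = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Date: Tue, 15 Nov 1994 08:12:31 GMT\r\n"
    "Server: example\r\n"
)
remote_time = date_from_headers(sample)
```

Raising an error when no Date header is present matters here: an empty string passed to `date -s` would otherwise fail silently or set an unexpected time.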

SSHKEYS once added cannot be removed by a user

  1. A user adds a friend's key to their box in atmosphere
  2. User redeploys and sees friend's key
  3. Friend goes hostile
  4. User removes key from atmosphere and redeploys
  5. Friend still has access, and makes user 😞

Duplicate user lines in /etc/sudoers

/etc/sudoers

my_username  ALL = (ALL) NOPASSWD: ALL
%users ALL=(ALL) NOPASSWD: ALL
my_username ALL=(ALL) NOPASSWD:ALL
my_username  ALL=(ALL) ALL

This probably could be solved with a blockinfile.
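
The reason a managed block helps is that it is rewritten in place on every run instead of being appended again. The following Python sketch illustrates that idempotency; it is not Ansible's implementation, `upsert_block` is a hypothetical name, and the markers are assumed to be blockinfile's default marker lines.

```python
BEGIN = "# BEGIN ANSIBLE MANAGED BLOCK"
END = "# END ANSIBLE MANAGED BLOCK"


def upsert_block(text, block_lines):
    """Replace the marked block in text, or append it if no markers exist."""
    lines = text.splitlines()
    managed = [BEGIN, *block_lines, END]
    if BEGIN in lines and END in lines:
        start, stop = lines.index(BEGIN), lines.index(END)
        lines[start:stop + 1] = managed  # overwrite the existing block in place
    else:
        lines.extend(managed)            # first run: append the block once
    return "\n".join(lines) + "\n"


sudoers = "%users ALL=(ALL) NOPASSWD: ALL\n"
entry = ["my_username ALL=(ALL) NOPASSWD: ALL"]
once = upsert_block(sudoers, entry)
twice = upsert_block(once, entry)  # a second deploy leaves the file unchanged
```

Note that this only prevents new duplicates. Lines written outside the markers by earlier deploys, like the ones shown above, would still need a one-time cleanup.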

python2 doesn't exist on instance, required by ansible

The following section detects Python 3 and installs Python 2:
https://github.com/cyverse/atmosphere-ansible/blob/0a352c14769e382dd8eb7928bb3294fdf97186a1/ansible/roles/atmo-ssh-setup/tasks/main.yml#L57-L68

However it doesn't detect python 3 correctly. Testing whether a user has ssh is not sufficient. @c-mart found that an Ubuntu 16.04 instance had ssh access as root, which resulted in no python2 install.

A simple fix (and what seems more correct) is to ensure that Python 2 is installed regardless of the distro.

Remove duplicate `Port` entry for `sshd_config`

Ubuntu 14.04.2 XFCE Base instance contains both:

Port 1657
Port 22

We should use lineinfile with a regex to remove all existing Port entries, then add back only the desired port for SSH. Only then can SSH be restarted.
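
The remove-then-add logic can be sketched in Python as follows. This is a minimal illustration, not the role's code: `set_single_port` is a hypothetical helper, the `PermitRootLogin` line is made up, and appending the directive at the end of the file is a simplification that would be wrong if the file ends in a `Match` block.

```python
import re

# sshd_config keywords are case-insensitive, so match "port" in any case.
PORT_LINE = re.compile(r"^\s*Port\s+\d+\s*$", re.IGNORECASE)


def set_single_port(config_text, port):
    """Drop every existing Port directive and append exactly one."""
    kept = [line for line in config_text.splitlines() if not PORT_LINE.match(line)]
    kept.append(f"Port {port}")
    return "\n".join(kept) + "\n"


before = "Port 1657\nPort 22\nPermitRootLogin no\n"
after = set_single_port(before, 22)  # 22 is an arbitrary choice for the example
```

Running the function again on its own output gives the same text, so repeated deploys would not reintroduce a second `Port` line.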

Problem: absolute path to `basename` used in backup script

The script that is deployed and available for backup/restore is using /bin/basename instead of locating the basename executable. This means that a change for basename to /usr/bin/basename breaks the script.

For an Ubuntu 16.04 instance ...

lenards@vm000-00:~$ ls -lha /bin/ | grep basename
lenards@vm000-100:~$ ls -lha /usr/bin/ | grep basename
-rwxr-xr-x  1 root   root     31K Mar  2  2017 basename
lenards@vm000-100:~$ ls -lha /sbin/ | grep basename

One approach might be to capture the location in a variable, BASENAME=$(which basename), and then use that in determining the remote location.
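
The same lookup-instead-of-hard-coding idea, sketched in Python for comparison. `locate` is a hypothetical helper built on the standard library's `shutil.which`; the backup script itself is a shell script, so this only illustrates the approach.

```python
import shutil


def locate(tool, path=None):
    """Return the path of tool found on PATH (or on the given search path string)."""
    found = shutil.which(tool, path=path)
    if found is None:
        raise FileNotFoundError(f"{tool} not found on the search path")
    return found
```

Calling `locate("basename")` would return `/usr/bin/basename` on the Ubuntu 16.04 instance shown above and `/bin/basename` on a system that keeps it there, and it fails loudly if the tool is missing rather than running a path that does not exist.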

Readme is out of date

The sections about playbooks are wrong. The playbooks have been moved into subdirectories.

Timezone and zone_info listed as separate variables

AFAIK there is always a symlink for the "proper name" of a timezone that redirects back to the more basic result. Here is an example:

(troposphere) root@r01c3b08:/opt/dev/troposphere# ll /usr/share/zoneinfo/America/Chicago
lrwxrwxrwx 1 root root 13 Oct 20  2015 /usr/share/zoneinfo/America/Chicago -> ../US/Central
(troposphere) root@r01c3b08:/opt/dev/troposphere# ll /usr/share/zoneinfo/US/Central
-rw-r--r-- 1 root root 3559 Oct 20  2015 /usr/share/zoneinfo/US/Central
(troposphere) root@r01c3b08:/opt/dev/troposphere# md5sum /usr/share/zoneinfo/US/Central
d0f076c9f390e7d8a933cc7cc1ad2e90  /usr/share/zoneinfo/US/Central
(troposphere) root@r01c3b08:/opt/dev/troposphere# md5sum /usr/share/zoneinfo/America/Chicago
d0f076c9f390e7d8a933cc7cc1ad2e90  /usr/share/zoneinfo/America/Chicago

Atmosphere-ansible should remove the ZONE_INFO variable and instead concatenate the TIME_ZONE variable to ZONE_INFO_DIR (which would get a default value).
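
A minimal sketch of the proposed concatenation, in Python. The default of `/usr/share/zoneinfo` is an assumption taken from the paths in the listing above, and `zone_info_path` is a hypothetical helper name.

```python
import posixpath

# Assumed default, based on the paths shown in the listing above.
ZONE_INFO_DIR = "/usr/share/zoneinfo"


def zone_info_path(time_zone, zone_info_dir=ZONE_INFO_DIR):
    """Build the zoneinfo file path from TIME_ZONE alone."""
    return posixpath.join(zone_info_dir, time_zone)
```

Because `America/Chicago` is a symlink to `US/Central`, `zone_info_path("America/Chicago")` and `zone_info_path("US/Central")` point at the same data, so a single TIME_ZONE variable is enough.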

Write fall back in docs for when ELK dies

Once logged in, perform some basic tests

Is Ansible still running? If so, check the ELK server for Ansible deployment logs and check for any deployment errors in the final report.

Need clarification for what to do when ELK is not up and running. @nfaction

Limit `chown` & `chmod` to a single filesystem

Otherwise, network filesystems mounted in a home directory will cause instance resume to fail catastrophically, not to mention breaking file permissions on the remote network filesystems.
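
One way to picture the restriction is a directory walk that compares device IDs and refuses to descend into anything on a different device, similar in spirit to `find -xdev`. The Python sketch below is an illustration under that assumption; `paths_on_same_filesystem` and its `device_of` parameter are hypothetical names, and unlike `find -xdev` it also skips the mount point directory itself.

```python
import os


def paths_on_same_filesystem(root, device_of=lambda p: os.lstat(p).st_dev):
    """Yield root and everything below it, without crossing into other filesystems."""
    root_device = device_of(root)
    yield root
    for current, dirs, files in os.walk(root):
        # Prune directories on another device (for example a network mount),
        # so os.walk never descends into them and they are never yielded.
        dirs[:] = [d for d in dirs
                   if device_of(os.path.join(current, d)) == root_device]
        for name in dirs + files:
            yield os.path.join(current, name)
```

A caller would then apply `os.chown` or `os.chmod` only to the yielded paths, leaving anything mounted inside the home directory untouched.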

When vms get a new IP address, web desktop breaks

I helped triage the following ticket where a user had a desktop, but applications failed to launch. In fact I could not launch applications from the terminal.

I tried the following:

export DISPLAY=<can be :2 for vnc or :5 for guacamole>
export XAUTHORITY=<path to Xauthority usually in ~/.Xauthority>
$ firefox 
No protocol specified
Failed to connect to Mir: Failed to connect to server socket: No such file or directory
Unable to init server: Could not connect: Connection refused
Error: cannot open display: :5

When I ran xauth list, there were no cookies for the current VM's new IP/hostname. This was the issue: the cookies must match the current hostname. All I had to do was add new cookies. See this well-written Stack Overflow answer:
https://stackoverflow.com/questions/20611783/after-changing-hostname-gedit-and-other-x-clients-dont-open/20612084#20612084
