

Ops CLI


From version 2.0 onward, ops-cli requires Python 3.
If you're still using Python 2, use an ops-cli version < 2.0.

ops-cli is a Python wrapper around Terraform, Ansible and SSH for cloud automation.

We use multiple tools to manage our infrastructure at Adobe. The purpose of ops-cli is to gather the common cluster configurations in a single place and, based on these, interact with the above-mentioned tools. This way, we avoid duplication and can quickly spin up new clusters (either production or development ones). All we need to do is customize the cluster configuration file (example here).

ops-cli integrates with the Azure and AWS CLIs to provide inventory, ssh, sync, tunnel and the ability to run Ansible playbooks on a fleet of EC2 instances. It can also add a layer of templating (using jinja2) on top of Terraform files, which is useful for removing duplicated code when spinning up infrastructure across multiple environments (stage/sandbox/prod) and across teams. It is useful for both AWS and Kubernetes deployments.


How does it work?

You define a cluster configuration using a yaml file. The yaml file contains different kinds of sections, one for each plugin. For instance, you could have a section for Terraform files, a section for AWS instructions, one for Kubernetes Helm charts and so forth.
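
For illustration, a minimal cluster file could look roughly like the sketch below. The inventory section mirrors the AWS example later in this document; the other plugin sections are only hinted at, since their keys are defined by the respective plugins (see the examples link above for real cluster files).

# clusters/mycluster.yaml -- illustrative sketch only
---
inventory:
  - plugin: cns
    args:
      clusters:
        - region: us-east-1
          boto_profile: my-aws-profile   # assumed profile name
          names: [mycluster]
# further sections (terraform, helm etc.) would follow here, one per plugin,
# using whatever keys those plugins define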

Use cases

Manage AWS EC2 instances

Once you have defined your cluster configuration, you can run ops commands such as listing the instance inventory.

# fetch instances from AWS and print them
ops clusters/mycluster.yaml inventory --limit webapp 

This outputs the list of matching instances.

Then you can run ssh, play, run, sync etc.

# SSH to one of the nodes (can handle bastion as well)
ops clusters/mycluster.yaml ssh webapp-01

# run a deployment playbook via ansible
ops clusters/mycluster.yaml play ansible/playbooks/task/webapp/deployment.yaml -- -e version=5.36.2 -u ec2-user --limit webapp

# run command on all selected nodes
ops clusters/mycluster.yaml run "sudo yum upgrade myawesomeapp; sudo service myawesomeapp restart" -- -u ec2-user --limit '"aam_app_group=canary;az=us-east-1a"'

# copy file to all servers
ops clusters/mycluster.yaml sync /tmp/myfile webapp: -l ec2-user

# create a tunnel
ops clusters/stage.yaml ssh --tunnel --local 8080 --remote 8080 stage-thanos-1 -l ec2-user

See examples/features/inventory

Terraform

# Performs jinja templating (if any) and runs terraform plan
ops clusters/mycluster.yaml terraform --path-name aws-eks plan

# Run terraform apply, with the possibility to sync the tf state files remotely (currently, AWS S3 bucket is supported + DynamoDB for locking). 
ops clusters/mycluster.yaml terraform --path-name aws-eks apply
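
As a sketch of the jinja2 templating layer mentioned in the introduction, a Terraform file under the configured path can reference values that ops-cli renders from the cluster configuration before invoking terraform. The variable names below are hypothetical; only the fact that a cluster variable is injected is taken from this document.

# terraform/aws-eks/vars.tf -- hypothetical snippet; {{ ... }} is rendered by
# ops-cli from the cluster configuration before `terraform plan` runs
variable "cluster" {
  default = "{{ cluster }}"       # e.g. the cluster name from the yaml file
}

variable "environment" {
  default = "{{ environment }}"   # hypothetical key, shown only to illustrate templating
}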


Run terraform by using hierarchical configs

See examples/features/terraform-hierarchical

Create Kubernetes cluster (using AWS EKS)

See examples/aws-kubernetes

Installing

Local

Virtualenv

Here is a link about how to install and use virtualenv: https://virtualenv.pypa.io/en/stable/

Ops tool installation

Python 3

# Make sure pip is up to date
curl https://bootstrap.pypa.io/get-pip.py | python3

# Install virtualenv
pip install --upgrade virtualenv
pip install --upgrade virtualenvwrapper

echo 'export WORKON_HOME=$HOME/.virtualenvs' >> ~/.bash_profile
echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.bash_profile
source ~/.bash_profile

# create virtualenv
mkvirtualenv ops
workon ops

# uninstall previous `ops` version (if you have it)
pip uninstall ops --yes

# install ops-cli v2.2.1 stable release
pip install --upgrade ops-cli

Terraform

Optionally, install Terraform to be able to use the terraform plugin: see https://www.terraform.io/intro/getting-started/install.html. For pretty formatting of terraform plan output, you can also install https://github.com/coinbase/terraform-landscape (use gem install on macOS).

Using docker image

You can try out ops-cli by using Docker. The Docker image has all the required prerequisites (python, terraform, helm, git, ops-cli etc.).

To start a container running the latest ops-cli Docker image:

docker run -it ghcr.io/adobe/ops-cli:2.2.1 bash

After the container has started, you can start using ops-cli:

ops help
# usage: ops [-h] [--root-dir ROOT_DIR] [--verbose] [-e EXTRA_VARS]
#           cluster_config_path
#           {inventory,terraform,packer,ssh,play,run,sync,noop} ...

git clone https://github.com/adobe/ops-cli.git
cd ops-cli
ls examples
# aws-kubernetes
# cassandra-stress
# features

cd examples/aws-kubernetes
ops clusters/my-kubernetes-cluster.yaml terraform --path-name aws-eks plan
# to set up aws-kubernetes, follow the steps from https://github.com/adobe/ops-cli/blob/master/examples/aws-kubernetes/README.md

Configuring

AWS

If you plan to use ops with AWS, you must configure credentials for each account:

$ aws configure --profile aws_account_name

Azure

TBD

Examples

See examples/ folder:

  • cassandra-stress - n-node cassandra cluster used for stress-testing; a basic stress profile is included
  • spin up a Kubernetes cluster
  • distinct ops features

Usage help

To see all commands and a short description run ops --help

usage: ops [-h] [--root-dir ROOT_DIR] [--verbose] [-e EXTRA_VARS]
           cluster_config_path
           {inventory,terraform,packer,ssh,play,run,sync,noop} ...

Run commands against a cluster definition

positional arguments:
  cluster_config_path   The cluster config path cluster.yaml
  {inventory,terraform,packer,ssh,play,run,sync,noop}
    inventory           Show current inventory data
    terraform           Wrap common terraform tasks with full templated
                        configuration support
    packer              Wrap common packer tasks and inject variables from a
                        cluster file
    ssh                 SSH or create an SSH tunnel to a server in the cluster
    play                Run an Ansible playbook
    run                 Runs a command against hosts in the cluster
    sync                Sync files from/to a cluster
    noop                used to initialize the full container for api usage

optional arguments:
  -h, --help            show this help message and exit
  --root-dir ROOT_DIR   The root of the resource tree - it can be an absolute
                        path or relative to the current dir
  --verbose, -v         Get more verbose output from commands
  -e EXTRA_VARS, --extra-vars EXTRA_VARS
                        Extra variables to use. Eg: -e ssh_user=ssh_user

More help

Each sub-command includes additional help information that you can get by running: ops examples/inventory/aam.yaml sync --help

Tool configuration: .opsconfig.yaml

Some tool settings are available via a .opsconfig.yaml configuration file. The file is looked up in /etc/opswrapper/.opsconfig.yaml, then in ~/.opsconfig.yaml, and then in the project folder starting from the current dir and going up to the root dir. All the files found this way are merged together, so you can set some global defaults, then project defaults in the root dir of the project, and overwrite them for individual envs. Eg: ~/.opsconfig.yaml, /project/.opsconfig.yaml, /project/clusters/dev/.opsconfig.yaml
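
For example, a project-level file might set a default ssh user. This is only a sketch: the ssh.user setting is referenced elsewhere in this document, but whether keys are written as dotted entries or nested mappings depends on your setup.

# /project/.opsconfig.yaml -- illustrative sketch
ssh.user: ec2-user   # assumed default remote user, used when -l/--user is not passed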

Inventory

The inventory command will list all the servers in a given cluster and cache the results for further operations on them (for instance, SSHing to a given node or running an ansible playbook).

You can always filter which nodes to display, or which nodes to run an ansible playbook on, by using the --limit argument (eg. --limit webapp). The filter is applied on the instance tags, which include the instance name.

The inventory works by issuing a describe call against AWS/Azure and matching all the nodes whose "cluster" tag equals the cluster name you have defined.

In order to configure it, you need to add the inventory section in your cluster configuration file (example here).

AWS example

---
inventory:
  - plugin: cns
    args:
      clusters:
        - region: us-east-1
          boto_profile: aam-npe # make sure you have this profile in your ~/.aws/credentials file
          names: [mycluster1] # this assumes the EC2 nodes have the Tag Name "cluster" with Value "mycluster1"

Azure example

---
inventory:
  - plugin: azr
    args:
      tags: environment=prod
      locations: westeurope,northeurope

Inventory usage

usage: ops cluster_config_path inventory [-h] [-e EXTRA_VARS]
                                         [--refresh-cache] [--limit LIMIT]
                                         [--facts]

optional arguments:
  -h, --help            show this help message and exit
  -e EXTRA_VARS, --extra-vars EXTRA_VARS
                        Extra variables to use. Eg: -e ssh_user=ssh_user
  --refresh-cache       Refresh the cache for the inventory
  --limit LIMIT         Limit run to a specific server subgroup. Eg: --limit
                        newton-dcs
  --facts               Show inventory facts for the given hosts

Terraform

usage: ops cluster_config_path terraform [-h] [--var VAR] [--module MODULE]
                                         [--resource RESOURCE] [--name NAME]
                                         [--plan]
                                         subcommand

positional arguments:
  subcommand           apply | console | destroy | import | output | plan |
                       refresh | show | taint | template | untaint

optional arguments:
  -h, --help            show this help message and exit
  --var VAR             the output var to show
  --module MODULE       for use with "taint", "untaint" and "import". The
                        module to use. e.g.: vpc
  --resource RESOURCE   for use with "taint", "untaint" and "import". The
                        resource to target. e.g.: aws_instance.nat
  --name NAME           for use with "import". The name or ID of the imported
                        resource. e.g.: i-abcd1234
  --plan                for use with "show", show the plan instead of the
                        statefile
  --skip-refresh        for use with "plan". Skip refresh of statefile
  --raw-output          for use with "plan". Show raw plan output without
                        piping through terraform landscape (if terraform
                        landscape is not enabled in opsconfig.yaml this will
                        have no impact)
  --path-name PATH_NAME
                        in case multiple terraform paths are defined, this
                        lets you specify which one to use when running
                        terraform

    Examples:
        # Create a new cluster with Terraform
        ops clusters/qe1.yaml terraform plan
        ops clusters/qe1.yaml terraform apply

        # Update an existing cluster
        ops clusters/qe1.yaml terraform plan
        ops clusters/qe1.yaml terraform apply

        # Get rid of a cluster and all of its components
        ops clusters/qe1.yaml terraform destroy

        # Retrieve all output from a previously created Terraform cluster
        ops clusters/qe1.yaml terraform output

        # Retrieve a specific output from a previously created Terraform cluster
        ops clusters/qe1.yaml terraform output --var nat_public_ip

        # Refresh a statefile (no longer part of plan)
        ops clusters/qe1.yaml terraform refresh

        # Taint a resource- forces a destroy, then recreate on next plan/apply
        ops clusters/qe1.yaml terraform taint --module vpc --resource aws_instance.nat

        # Untaint a resource
        ops clusters/qe1.yaml terraform untaint --module vpc --resource aws_instance.nat

        # Show the statefile in human-readable form
        ops clusters/qe1.yaml terraform show

        # Show the plan in human-readable form
        ops clusters/qe1.yaml terraform show --plan

        # View parsed jinja on the terminal
        ops clusters/qe1.yaml terraform template

        # Import an unmanaged existing resource to a statefile
        ops clusters/qe1.yaml terraform import --module vpc --resource aws_instance.nat --name i-abcd1234

        # Use the Terraform Console on a cluster
        ops clusters/qe1.yaml terraform console

        # Validate the syntax of Terraform files
        ops clusters/qe1.yaml terraform validate

        # Specify which terraform path to use
        ops clusters/qe1.yaml terraform plan --path-name terraformFolder1

Terraform landscape

For pretty formatting of terraform plan output you can install https://github.com/coinbase/terraform-landscape (use gem install on macOS). To make ops use it, you need to add terraform.landscape: True to your .opsconfig.yaml file.
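
For example (a sketch; the key is written here exactly as stated in the sentence above):

# .opsconfig.yaml -- illustrative sketch
terraform.landscape: True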

SSH

usage: ops cluster_config_path ssh [-h] [-e EXTRA_VARS] [-l USER]
                                   [--ssh-config SSH_CONFIG] [--index INDEX]
                                   [--tunnel] [--ipaddress] [--local LOCAL]
                                   [--remote REMOTE] [--proxy] [--nossh]
                                   role [ssh_opts [ssh_opts ...]]

positional arguments:
  role                  Server role to ssh to. Eg: dcs
  ssh_opts              Manual ssh options

optional arguments:
  -h, --help            show this help message and exit
  -e EXTRA_VARS, --extra-vars EXTRA_VARS
                        Extra variables to use. Eg: -e ssh_user=ssh_user
  -l USER, --user USER  SSH User
  --ssh-config SSH_CONFIG
                        Ssh config file name in the ./ansible dir
  --index INDEX         Index of the server from the group
  --tunnel              Use SSH tunnel, must pass --local and --remote
  --ipaddress
  --local LOCAL         local port for ssh proxy or ssh tunnel
  --remote REMOTE       remote port for ssh tunnel
  --proxy               Use SSH proxy, must pass --local
  --nossh               Port tunnel a machine that does not have SSH. Implies
                        --ipaddress, and --tunnel; requires --local and
                        --remote
  --keygen              Create an ssh key pair to use with this infrastructure
  --noscb               Disable use of Shell Control Box (SCB) even if it is
                        enabled in the cluster config
  --auto_scb_port       When using Shell Control Box (SCB) and creating a
                        proxy, a random port is generated, which will be used
                        in the ssh config for all playbook, run and sync
                        operations

    Examples:
        # SSH using current username as remote username
        ops clusters/qe1.yaml ssh nagios

        # SSH using a different username
        ops clusters/qe1.yaml ssh nagios -l ec2-user

        # SSH to the second nagios instance
        ops clusters/qe1.yaml ssh nagios --index 2

        # SSH to a specific hostname, instead of the tagged role
        ops clusters/qe1.yaml ssh full-hostname-here-1

        # Create an SSH tunnel to Nagios forwarding the remote port 80 to local port 8080
        ops clusters/qe1.yaml ssh --tunnel --remote 80 --local 8080 nagios

        # Create an SSH tunnel to a host where the service is NOT listening on `localhost`
        ops clusters/qe1.yaml ssh --tunnel --remote 80 --local 8080 nagios --ipaddress

        # Create an SSH tunnel to a host with an open port which does NOT have SSH itself (Windows)
        # Note that the connection will be made from the Bastion host
        ops clusters/qe1.yaml ssh --tunnel --local 3389 --remote 3389 --nossh windowshost

        # Create a proxy to a remote server that listens on a local port
        ops clusters/qe1.yaml ssh --proxy --local 8080 bastion

        # In case Shell Control Box (SCB) is configured and enabled on the cluster, a proxy
        # (used by all ops play, run and sync operations) can be created using either the
        # port configured in the cluster config file or an auto-generated port.
        # In this case the --local param must not be used.
        # Example for using the port configured in the cluster config
        ops clusters/qe1.yaml ssh bastion --proxy
        # Example for using the auto generated port
        ops clusters/qe1.yaml ssh bastion --proxy --auto_scb_port


        # Disable use of Shell Control Box (SCB) even if it is enabled in the cluster config
        ops clusters/qe1.yaml ssh bastion --noscb

SSHPass

In case you want to use the OSX Keychain to store your password and reuse it across multiple nodes (e.g. running a playbook on 300 nodes without having to enter the password for every node), follow the tutorial below:

  1. Open Keychain Access app on OSX

  2. Create a new keychain (File -> New Keychain), let's say aam

  3. Select the aam keychain and add a new password entry to it (File -> New Password Item):
     - Name: idm
     - Kind: application password
     - Account: your_ldap_account (e.g. johnsmith)
     - Where: idm

  4. Create $HOME/bin dir - this is where the scripts below are saved

  5. Create ~/bin/askpass script and update the ldap account there:

cat > ~/bin/askpass  <<"EOF"
#!/usr/bin/env bash
/usr/bin/security find-generic-password -a <your_ldap_account> -s idm -w $HOME/Library/Keychains/aam.keychain
EOF
chmod +x ~/bin/askpass

  6. Check out the notty GitHub repo, build it and move the binary to $HOME/bin/

  7. Create a ~/bin/sshpass script:

cat > $HOME/bin/sshpass <<"EOF"
#!/usr/bin/env bash
export DISPLAY=:99
export SSH_ASKPASS="$HOME/bin/askpass"
[[ $1 == -d* ]] && shift
$HOME/bin/notty $@
EOF

chmod +x $HOME/bin/sshpass

  8. Verify the setup works:

# Connect to bastion
~/bin/sshpass ssh -o StrictHostKeyChecking=no -l <your_ldap_account> <52.5.5.5>

  9. Run the ops tool.

SCB

Shell Control Box (SCB) is an activity-monitoring appliance from Balabit (now One Identity) that controls privileged access to remote servers. ops has support for using SCB as an ssh proxy for the following operations: ssh, tunnel, proxy, ansible play, run and sync.

In order to use SCB, an extra section needs to be added to the cluster config file:

scb:
  enabled: true
  host: "scb.example.com"
  proxy_port: 2222 # optional

With this config in place, all ssh operations will be done via the SCB host, unless the --noscb flag is used.

When using SCB, SSHPass will not be used.

For ansible play, run and sync operations to work via SCB, a proxy needs to be created first; then run ops in a different terminal window or tab:

# 1. Create a proxy in a terminal window
# Example for using the port configured in the cluster config
ops clusters/qe1.yaml ssh bastion --proxy
# Example for using the auto generated port
ops clusters/qe1.yaml ssh bastion --proxy --auto_scb_port

# 2. Run the play/run/sync command normally in a different terminal window or tab
# A message will indicate the scb proxy is used
ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml
...
Connecting via scb proxy at 127.0.0.1:2222.
This proxy should have already been started and running in a different terminal window.
If there are connection issues double check that the proxy is running.
...

Play

Run an ansible playbook.

usage: ops cluster_config_path play [-h] [-e EXTRA_VARS] [--ask-sudo-pass]
                                    [--limit LIMIT] [--noscb]
                                    playbook_path
                                    [ansible_args [ansible_args ...]]

positional arguments:
  playbook_path         The playbook path
  ansible_args          Extra ansible args

optional arguments:
  -h, --help            show this help message and exit
  -e EXTRA_VARS, --extra-vars EXTRA_VARS
                        Extra variables to use. Eg: -e ssh_user=ssh_user
  --ask-sudo-pass       Ask sudo pass for commands that need sudo
  --limit LIMIT         Limit run to a specific server subgroup. Eg: --limit
                        newton-dcs
  --noscb               Disable use of Shell Control Box (SCB) even if it is
                        enabled in the cluster config

    Examples:
        # Run an ansible playbook
        ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml

        # Limit the run of a playbook to a subgroup
        ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml -- --limit dcs

        # Overwrite or set a variable
        ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml -- -e city=paris

        # Filter with tags
        ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml -- -t common

        # Run a playbook and overwrite the default user
        ops clusters/qe1.yaml play ansible/plays/cluster/configure.yaml -- -u ec2-user

Run command

Run a bash command on the selected nodes.

usage: ops cluster_config_path run [-h] [--ask-sudo-pass] [--limit LIMIT]
                                   [--noscb]
                                   host_pattern shell_command
                                   [extra_args [extra_args ...]]

positional arguments:
  host_pattern     Limit the run to the following hosts
  shell_command    Shell command you want to run
  extra_args       Extra ansible arguments

optional arguments:
  -h, --help       show this help message and exit
  --ask-sudo-pass  Ask sudo pass for commands that need sudo
  --limit LIMIT    Limit run to a specific server subgroup. Eg: --limit
                   newton-dcs
  --noscb          Disable use of Shell Control Box (SCB) even if it is
                   enabled in the cluster config

    Examples:
        # Last 5 installed packages on each host
        ops qe1.yaml run all 'sudo grep Installed /var/log/yum.log | tail -5'

        # See nodetool status on each cassandra node
        ops qe1.yaml run qe1-cassandra 'nodetool status'

        # Complex limits
        ops qe1.yaml run 'qe1-cassandra,!qe1-cassandra-0' 'nodetool status'

        # Show how to pass other args

Sync files

Performs rsync to/from a given set of nodes.

usage: ops cluster_config_path sync [-h] [-l USER] [--noscb]
                                    src dest [opts [opts ...]]

positional arguments:
  src                   Source dir
  dest                  Dest dir
  opts                  Rsync opts

optional arguments:
  -h, --help            show this help message and exit
  -l USER, --user USER  Value for remote user that will be used for ssh
  --noscb               Disable use of Shell Control Box (SCB) even if it is
                        enabled in the cluster config

        rsync wrapper for ops inventory conventions

        Example:

        # rsync from remote dcs role
        ops cluster.yml sync 'dcs[0]:/usr/local/demdex/conf' /tmp/configurator-data --user remote_user

        # extra rsync options
        ops cluster.yml sync 'dcs[0]:/usr/local/demdex/conf' /tmp/configurator-data -l remote_user -- --progress

Noop

usage: ops cluster_config_path noop [-h]

optional arguments:
  -h, --help  show this help message and exit

Packer

Runs packer, for creating images.

usage: ops cluster_config_path packer [-h] subcommand

positional arguments:
  subcommand  build | validate

optional arguments:
  -h, --help  show this help message and exit

    Examples:
        # Validate a packer file
        ops clusters/centos7.yaml packer validate

        # Build a packer file
        ops clusters/centos7.yaml packer build

Secrets Management

There are cases where you need to reference sensitive data in your cluster.yaml file (credentials, passwords, tokens etc). Given that the cluster configuration file can be stored in a version control system (such as Git), the best practice is to not put sensitive data in the file itself. Instead, we can use ops-cli to fetch the desired credentials from a secrets manager such as Vault or Amazon SSM, at runtime.

Vault

Ops can manage the automatic generation of secrets and push them to Vault, without actually persisting the secrets in the cluster file. A cluster file only needs to use a construct like the following:

db_password: "{{'secret/campaign/generated_password'|managed_vault_secret(policy=128)}}"

Behind the scenes, this translates into:

  • look up the secret in Vault at secret/campaign/generated_password, under the default key 'value' (an Adobe convention that can be overridden with the key parameter; see the example after this list)
  • if the value there is missing, generate a new secret using the passgen engine with a policy of 128 characters
  • return the generated value
  • if a value at that path already exists, just return it. This lets cluster files refer to a secret that actually exists in Vault and ensures we only generate it once - if it was already created by ops or any other system, we simply use what is already there. The reference is a fixed-form jinja call added to the cluster file, which is interpreted later during the templating phase.
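
For instance, overriding the default 'value' key and using a shorter password policy might look like the following (a sketch based on the filter arguments named above; the key name 'password' and the policy length 32 are just examples):

db_password: "{{ 'secret/campaign/generated_password' | managed_vault_secret(policy=32, key='password') }}"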

Amazon Secrets Manager (SSM)

Amazon offers the possibility to use its Secrets Manager to manage configuration data such as credentials, passwords and license keys.

We can use ops-cli to fetch the sensitive data from SSM at runtime. Just define this in your cluster configuration file (eg. mycluster.yaml):

db_password: "{{ '/my/ssm/path' | read_ssm(aws_profile='myprofile') }}"

ops-cli will read the SSM value by running a command similar to: AWS_PROFILE=myprofile aws ssm get-parameter --name "/my/ssm/path" --region us-east-1 --with-decryption. Note that you can specify the AWS region via read_ssm(aws_profile='myprofile', region_name='us-west-2').
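
For example, pinning the region explicitly uses the same filter with the region_name argument mentioned above:

db_password: "{{ '/my/ssm/path' | read_ssm(aws_profile='myprofile', region_name='us-west-2') }}"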

Using jinja2 filters in playbooks and terraform templates

You can register your own jinja2 filters to use in the cluster config file, terraform templates and ansible playbooks.

All ops commands look for custom filters in a set of known locations, such as the plugins/filter_plugin folder used in the example below.

Example simple filter:

# plugins/filter_plugin/myfilters.py

def my_filter(string):
    return 'filtered: ' + string


class FilterModule(object):
    def filters(self):
        return {
            'my_filter': my_filter
        }

# usage in playbook, templates, cluster config
# test_custom_filters: "{{ 'value' | my_filter }}"

SKMS

Create a file in ~/.skms/credentials.yaml which looks like the following:

endpoint: "api.skms.mycompany.com"
username: <username>
password: <password>

Development

Install ops in development mode

git clone https://github.com/adobe/ops-cli.git
cd ops-cli
# Install openssl
brew install openssl libyaml
env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" python setup.py develop

Running tests

  • on your machine: py.test tests

Troubleshooting

  • Permission issues when installing: you should install the tool in a python virtualenv

  • Exception when running ops: pkg_resources._vendor.packaging.requirements.InvalidRequirement: Invalid requirement, parse error at "'!= 2.4'"

    Caused by a broken paramiko version, reinstall paramiko: pip2 uninstall paramiko; pip2 install paramiko

  • Exception when installing ops because the cryptography package fails to install:

Either install the tool in a virtualenv or:

    brew install libffi
    brew link libffi --force
    brew install openssl  
    brew link openssl --force

License

Apache License 2.0


Contributors

alanyee, amuraru, anukaal, asalajan, asjidkalam, astancu, carolinestere, cmccandless, complexsplit, costimuraru, danielcoman, dependabot[bot], filmaj, flopp, gahag, jewee, keuhdall, mohansha, mujx, renovate[bot], serban21, sidhyatikku, stefandinu, sumitkharche, synnek1337, titirigaiulian, wenzdey


ops-cli's Issues

Vault lib enhancements

Description:
Enhance the Vault module: it no longer muffles exceptions, and it exposes another function called check_vault() that returns a Jinja2-friendly boolean ("true" or "false").
Added support for enforcing a minimum ops version in Terraform clusters ("ops_min_version").

Motivation and Context
Aims to solve scenarios where reading a non-existent Vault secret always returned "None" and never threw an exception.
Aims to make it easier to enforce a minimum version of ops across larger teams.

Upgrade boto3 and awscli version requirement


Can't install ops-cli in dev mode on a fresh python virtualenv

Expected Behaviour

Successfully install ops-cli in development mode on a fresh python virtual env. Command to run: env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" python setup.py develop

Actual Behaviour

Installation is crashing because of multiple pip package version conflicts.

Reproduce Scenario (including but not limited to)

Steps to Reproduce

Step 1
run env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" python setup.py develop

Result:
error: s3transfer 0.2.0 is installed but s3transfer<0.2.0,>=0.1.12 is required by set(['awscli'])

Step 2

Add s3transfer==0.1.13 to requirements.txt and rerun the install command.

Result:
error: s3transfer 0.1.13 is installed but s3transfer<0.3.0,>=0.2.0 is required by set(['boto3'])

Step 3

Set boto3==1.9.88 in requirements.txt, which fixed the s3transfer issue but generated another one

Results:

error: botocore 1.12.94 is installed but botocore==1.12.87 is required by set(['awscli'])

Step 4:

Add botocore==1.12.87 to requirements.txt, which led to the following error:

Result:

error: botocore 1.12.87 is installed but botocore<1.13.0,>=1.12.88 is required by set(['boto3'])

Step 5:

Set boto3==1.9.87 in requirements.txt, which led to the final error:

Result:

error: PyYAML 4.2b4 is installed but PyYAML<=3.13,>=3.10 is required by set(['awscli'])

Step 6:

Set PyYAML==3.13 in requirements.txt and the installation completed successfully.

Platform and Version

macOS 10.13.4

Add support for running helmfile on arbitrarily configured clusters

Currently, helmfile composition generates kubernetes cluster context information for EKS clusters via aws eks update-kubeconfig commands.
Previously, if the variables required to generate it were missing, the tool would run with the default kubernetes context configured.

Since this has proven to be dangerous, running against the default context has been removed.
To be able to run helmfile compositions against non-EKS clusters, support for explicit context selection needs to be added to ops-cli.

Move Opstool to python 3

[Migrated]

Python 2 is EOL in 1 yr 5 months - as such, all of my team's development is being done in Python 3. Mixing Opstool into python 3 projects is a bit cumbersome and requires virtualenvs and other nastiness - maybe it's time to port opstool to python 3? FWIW I'd be willing to do a bunch of the legwork for this, just want to gauge interest before I undertake such a task.

Support ops terraform force-unlock

Expected Behaviour

Useful for cases where the tf states are stored remotely (ie. in an S3 bucket) with DynamoDB locking. If something happens (eg. credentials expire, terraform crashes), it would be useful to have a way to remove the lock that was acquired. Currently, this can only be done manually, by going to DynamoDB and removing the entry (the lock). However, terraform has a command to remove the lock. We should add it in ops as well.

ops mycluster.yaml terraform force-unlock <LOCK_ID> should work.
https://www.terraform.io/docs/commands/force-unlock.html

Actual Behaviour

Not implemented yet.

Platform and Version

v0.23

Inventory should support static bastion host

[Migrated]

For use cases that have a static bastion host hosted in corp network, it would be useful to have a way to specify that in the cluster.yaml file in the inventory section.

Terraform v0.12 - deprecated CLI option `-module-depth`

Terraform v0.12 disables module-depth option for all operations except graph (as of 0.v12-alpha3):
hashicorp/terraform@178ec8f#diff-aa88524f69c6514b7d5568e0b5e03bad

Expected Behaviour

  1. Install Python
  2. pip install https://github.com/adobe/ops-cli/releases/download/0.30/ops-0.30-py2-none-any.whl
  3. brew install terraform # CURRENT VERSION v0.12
  4. ops environment/path.yaml terraform plan
  5. Generate a valid plan file.

Actual Behaviour

Terraform dumps usage.

Reproduce Scenario (including but not limited to)

Have Terraform v0.12 installed.

Steps to Reproduce

  1. Install Python

  2. pip install https://github.com/adobe/ops-cli/releases/download/0.30/ops-0.30-py2-none-any.whl

  3. brew install terraform # CURRENT VERSION v0.12

  4. git clone https://github.com/adobe/ops-cli.git

  5. `cd

  6. ops environment/sample-cluster.yaml terraform plan

  7. Generate a valid plan file.

Platform and Version

MacOS 10.14.5
Terraform Version: v0.12.2 (or >v0.12-alpha3)
OPS-CLI Version: v0.30

Sample Code that illustrates the problem

Any valid operational config.

Logs taken while reproducing problem

cd ${HOME}/terraform && terraform get -update && terraform init && terraform refresh -input=false  -var 'cluster=aaron-kulick-dev'  -state=terraform.aaron-kulick-dev.tfstate && terraform plan -out=terraform.aaron-kulick-dev.plan -refresh=false -module-depth=1 -input=false  -var 'cluster=aaron-kulick-dev'  -state=terraform.aaron-kulick-dev.tfstate

N/A - c.f. CLI execution as debug output.

Allow ssh'ing into custom hostnames/IPs not in inventory

Expected Behaviour

ops cluster.yaml ssh 10.0.0.1
ssh'es through the bastion into 10.0.0.1, even when the IP
is not part of the inventory.

This is needed in environments where the hosts are not part of the inventory
but are reachable via the bastion.

Enable support for shared resources

I think we should introduce the concept of shared resources between clusters. I'm thinking of hosts that are deployed in the admin cluster and need to be used in the product cluster. For example, even though Prometheus servers are part of the admin cluster, when deploying the product artifacts we should find a way to deploy new alerts/recording rules as well, without changing the cluster/playbook or running it twice against two different clusters.

Add escape_json filter

Useful when you want to pass JSON in a terraform file.

resource "kubernetes_secret" "{{repo.name}}-docker-repo-secret" {
  metadata {
    name = "{{repo.name}}-docker-repo-secret"
  }

  data {
    ".dockerconfigjson" = "{{ macros.generate_docker_json_config(repo.repository, repo.credentials) | escape_json }}"
  }

  type = "kubernetes.io/dockerconfigjson"
}
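
A minimal filter along these lines could follow the FilterModule pattern from the "Using jinja2 filters" section above. This is only a sketch of the idea, not the filter that ops-cli actually ships:

# plugins/filter_plugin/escape_json.py -- sketch only
import json


def escape_json(value):
    # JSON-encode the value as a string, then drop the surrounding quotes so the
    # result can be embedded inside an already-quoted terraform string
    return json.dumps(str(value))[1:-1]


class FilterModule(object):
    def filters(self):
        return {'escape_json': escape_json}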

Upgrade to ansible 2.7

Expected Behaviour

The 2.3.1.0 version of ansible is quite old and is reported by GitHub as having a security vulnerability.
We should upgrade to the latest version as of today: 2.7.10.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Rate-Limited

These updates are currently rate-limited. Click on a checkbox below to force their creation now.

  • Update pypa/gh-action-pypi-publish digest to 3fbcf7c
  • Update dependency Jinja2 to v3.1.3
  • Update dependency himl to v0.15.2
  • Update dependency msrestazure to v0.6.4.post1
  • Update github/codeql-action action to v3
  • Create all rate-limited PRs at once

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

dockerfile
Dockerfile
  • python 3.11.4-alpine3.18
  • python 3.11.4-alpine3.18
github-actions
.github/workflows/codeql-analysis.yml
  • actions/checkout v4
  • github/codeql-action v2
  • github/codeql-action v2
  • github/codeql-action v2
.github/workflows/release.yml
  • actions/checkout v4
  • actions/setup-python v5
  • pypa/gh-action-pypi-publish c12cc61414480c03e10ea76e2a0a1a17d6c764e2
  • actions/checkout v4
  • docker/login-action v3
  • docker/metadata-action v4
  • docker/build-push-action v5
pip_requirements
requirements.txt
  • simpledi ==0.4.1
  • awscli ==1.29.12
  • boto3 ==1.28.12
  • boto ==2.49.0
  • ansible ==8.5.0
  • azure-common ==1.1.28
  • azure ==4.0.0
  • msrestazure ==0.6.4
  • Jinja2 ==3.1.2
  • hvac ==1.1.1
  • inflection ==0.5.1
  • kubernetes ==26.1.0
  • himl ==0.15.0
  • GitPython ==3.1.*

  • Check this box to trigger a request for Renovate to run again on this repository

Switch to more inclusive language

Could you please take some time in the next few days to make changes to some terminology in your repos and content as much as is possible:

  • Whitelist/blacklist to Allowed List and Blocked List (or Approve List/Deny List - some software uses this instead) respectively. Google and many developers are formalizing allowlist and blocklist. You might want to lobby for those terms to be used in the UI.
  • Master/Slave to master and replica (or subordinate, if that makes more sense) respectively.

If you cannot remove the term because the writing, for example, reflects the UI or the code, please make a note and send me an email to [email protected] so we can bring it to that team's attention. Thanks for your efforts in this matter.

getpass missing from sync

Expected Behaviour

Files get synced

Actual Behaviour

Loading cached inventory info from: ~/.ops/cache/432fa75899488401313802367b97c24b
Loading cached inventory info from: ~/.ops/cache/432fa75899488401313802367b97c24b
Looking for hosts for pattern 'tabcontroller'
Traceback (most recent call last):
File "/Users/complexsplit/.pyenv/versions/cns/bin/ops", line 10, in
sys.exit(run())
File "/Users/complexsplit/.pyenv/versions/2.7.10/envs/cns/lib/python2.7/site-packages/ops/main.py", line 151, in run
output = app_container.run()
File "/Users/complexsplit/.pyenv/versions/2.7.10/envs/cns/lib/python2.7/site-packages/ops/main.py", line 144, in run
return runner_instance.run(self.console_args)
File "/Users/complexsplit/.pyenv/versions/2.7.10/envs/cns/lib/python2.7/site-packages/ops/cli/sync.py", line 84, in run
ssh_user = self.cluster_config.get('ssh_user') or self.ops_config.get('ssh.user') or getpass.getuser()
NameError: global name 'getpass' is not defined

Reproduce Scenario (including but not limited to)

Steps to Reproduce

ops cluster.yaml sync 'foomachine01:/tmp/bar.log' /Users/complexsplit/Downloads

Platform and Version

MacOS 10.14.6. ops-cli 1.8


Enforce PEP 8 style guide and other Python best practices

I have noticed quite a few implementations in Python that are not quite pythonic or whose behavior could be optimized. I have opened this issue to track changes to the code that enforce a uniform and readable style for the existing Python scripts.

https://www.python.org/dev/peps/pep-0008/

Sample Code that illustrates the problem

elif type(value) == type(None)

The above is problematic for two reasons. First, isinstance() is better than comparing types directly, and secondly, for singletons like None,

elif value is None

is sufficient and cleaner.
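
As a small, generic illustration of the point above (not taken from the ops-cli codebase):

def describe(value):
    # identity check for the None singleton, isinstance() for real type checks
    if value is None:
        return "missing"
    if isinstance(value, dict):
        return "mapping"
    return "other"

print(describe(None))  # missing
print(describe({}))    # mapping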

Publish docker image with ops-cli

Currently, installing all prerequisites for ops-cli can be a bit cumbersome (ie. nailing the python2 version). We're trying to simplify this. In the meantime, it would be nice to be able to try out ops-cli with one click. To that end, having ops-cli and its prerequisites in a docker image would offer a quick setup.

$ docker run -it costimuraru/ops-cli:0.25 bash
[linuxbrew@docker ~]$ ops help
# usage: ops [-h] [--root-dir ROOT_DIR] [--verbose] [-e EXTRA_VARS]
#           cluster_config_path
#           {inventory,terraform,packer,ssh,play,run,sync,noop} ...

Execute helmfile within wrong kube context

Expected Behaviour

When the ops path helmfile command is executed, the EKS cluster kubeconfig needs to be generated.

Actual Behaviour

If the cluster.fqdn key is missing from the cluster configuration, the kubeconfig file is not generated and the helmfile command runs in the existing kubernetes context.
This can lead to helmfile executing against the wrong cluster.

Add support for Kops Cluster on AWS

First of all, congratulations on ops-cli!
I think it might be relevant to capture the user base that builds clusters with kops today. Even though EKS exists, there are still a lot of users who stick with kops for more control and more customisation, and because of the fewer features or options on the EKS side, and so on and so forth.

I would love to see ops support kops clusters. I only discovered this project today while searching for something completely different: a templating solution for my Spinnaker pipelines.
Anyway, kudos for the promising tool.

Cheers

Migrate ec2inventory plugin to use boto3

[Migrated]

The upgrade to Boto3 is part of an ongoing initiative of having a single instance which coordinates deployments - with support for automatic deployments in pre-production environments.

Boto3 has built-in support for the default AWS Credentials Providers stack.

We need to set the cross-account trusting policies for our accounts, being then able to specify automatic role assumption in a cross-account context via ~/.aws/config.

It might not work out of the box, as it doesn't for aws-cli (see: aws/aws-cli#1604 and aws/aws-cli#1390).
In that case, we would need to do programmatic role assumption - I have not been able to fully test this yet (permissions).

Note
This change is intended to be backward-compatible with the current setup - in other words, temporary/static credentials should work as before.

Enable plugin caching when running terraform commands

Expected Behaviour

Enable TF_PLUGIN_CACHE_DIR in docker container to speed up tf commands.

e.g.

export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
mkdir -p $TF_PLUGIN_CACHE_DIR

Actual Behaviour

Currently, when using ops-cli to drive EKS clusters there are multiple tf commands run.
For each of these runs, tf downloads the aws tf provider, slowing down the overall command.

- Downloading plugin for provider "aws" (hashicorp/aws) 2.30.0...

@costimuraru do you see any issue in enabling this?
