
puppet-ceph's Introduction

puppet-ceph

About


This is a Puppet module to install a Ceph cluster.

Status

Developed/tested on Debian GNU/Linux Wheezy, targeting the Bobtail Ceph release.

Features

  • Ceph package ✓
  • Ceph MONs ✓
    • MON configuration ✓
    • MON service key ✓
    • MON filesystem creation ✓
    • MON service ✓
    • MON cluster ✓
    • admin key ✓
  • Ceph OSDs ✓
    • OSD configuration ✓
    • OSD bootstrap key ✓
    • OSD device ✓
      • OSD device formatting ✓
      • OSD device mounting ✓
      • OSD filesystem creation ✓
      • OSD service key ✓
    • OSD service ✓
    • OSD registration ✓
    • Insert OSD into crushmap ✓
    • Working OSD ✓

TODO

  • Finish writing the rspec tests
  • Better OSD device placement possibilities
  • Test/finish MDS/RadosGW code

Contributing

Contributions are welcome, just fork on GitHub and send a pull request!

  • Please keep each new feature / change / fix in a separate pull request; it greatly speeds up merging ;)
  • When adding features, don't forget to add unit tests.
  • puppet-lint (https://github.com/rodjek/puppet-lint) should not report too many errors either :)
  • Some repo collaborators might ask you questions or ask you to update your code. All these questions/requests are open to discussion, but please answer; don't just drop your code and abandon it!

Using

To install a Ceph cluster you'll need at least one host to act as a MON and, with the current crushmap defaults, two hosts to act as OSDs. (A MON may run on the same host as an OSD, but this has not been tested yet.) And of course one puppetmaster :-)

This module requires the puppet master to have storeconfigs = true set and a storage backend configured. On the puppet agents, pluginsync = true is required too.
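
For reference, a minimal sketch of the corresponding puppet.conf settings (storeconfigs_backend = puppetdb assumes PuppetDB as the storage backend; adapt to your setup):

# puppet.conf on the puppet master
[master]
storeconfigs = true
storeconfigs_backend = puppetdb

# puppet.conf on the agents
[agent]
pluginsync = true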

Minimum Puppet manifest for all members of the Ceph cluster

A Ceph cluster needs a cluster fsid: generate one with uuidgen -r (install it with apt-get install uuid-runtime):

$fsid = 'some uuid from uuidgen -r'

The general configuration:

class { 'ceph::conf':
  fsid            => $fsid,
  auth_type       => 'cephx', # Currently only cephx is supported AND required
  cluster_network => '10.0.0.0/24', # The cluster's «internal» network
  public_network  => '192.168.0.0/24', # The cluster's «public» (where clients are) network
}

APT configuration to install from the official Ceph repositories:

include ceph::apt::ceph

Puppet manifest for a MON

A MON host also needs the MON secret: generate it with ceph-authtool --create-keyring /path/to/keyring --gen-key -n mon.:

$mon_secret = 'AQD7kyJQQGoOBhAAqrPAqSopSwPrrfMMomzVdw=='

An Id:

$id = 0 # must be unique for each MON in the cluster

And the mon declaration:

ceph::mon { $id:
  monitor_secret => $mon_secret,
  mon_addr       => '192.168.0.10', # The host's «public» IP address
}

Then on ONLY ONE MON, export the admin key (required by the OSDs):

if !empty($::ceph_admin_key) {
  @@ceph::key { 'admin':
    secret       => $::ceph_admin_key,
    keyring_path => '/etc/ceph/keyring',
  }
}
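
For reference: the OSD nodes don't need extra code to pick this up; the module's osd.pp collects the exported key itself with a spaceship collector (see the manifests/osd.pp excerpt quoted in the issues below):

Package['ceph'] -> Ceph::Key <<| title == 'admin' |>>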

NOTE: The puppet agent needs to be run 3 times for the MON to be up and the admin key exported.
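
Putting the pieces together, a complete MON node manifest might look like the following sketch (assembled from the snippets above; node name, id and addresses are examples):

node 'mon0' {
  class { 'ceph::conf':
    fsid            => $fsid,
    auth_type       => 'cephx',
    cluster_network => '10.0.0.0/24',
    public_network  => '192.168.0.0/24',
  }

  include ceph::apt::ceph

  ceph::mon { '0':
    monitor_secret => $mon_secret,
    mon_addr       => '192.168.0.10',
  }

  # On ONLY ONE MON: export the admin key once the fact is available
  if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
  }
}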

Puppet manifest for an OSD

An OSD host also needs the global host configuration for OSDs:

class { 'ceph::osd':
  public_address  => '192.168.0.100',
  cluster_address => '10.0.0.100',
}

And for each disk/device the path of the physical device to format:

ceph::osd::device { '/dev/sdb': }

WARNING: this previous step will destroy all data on the disk!

On an OSD host, the puppet agent must be run at least 4 times for the OSD device to be formatted, registered with the cluster, and inserted into the crushmap.
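
Per-device options: the issues quoted below pass a journalsize parameter to ceph::osd::device; assuming your version of the module supports it, a sketch:

ceph::osd::device { '/dev/sdc':
  journalsize => '200', # hypothetical per-device journal size, as seen in user reports below
}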

Testing

Using Vagrant

Clone the repo & enter the created directory

git clone git://github.com/enovance/puppet-ceph.git
cd puppet-ceph

Launch three MONs

vagrant up mon0
vagrant up mon1
vagrant up mon2

Run puppet one more time to update the ceph configuration (uses exported resources)

vagrant ssh mon0 -c 'sudo puppet agent -vt'
vagrant ssh mon1 -c 'sudo puppet agent -vt'
vagrant ssh mon2 -c 'sudo puppet agent -vt'

Ceph MONs should be up

vagrant ssh mon0 -c "sudo ceph mon stat"
    e3: 3 mons at {0=192.168.251.10:6789/0,1=192.168.251.11:6789/0,2=192.168.251.12:6789/0}, election epoch 4, quorum 0,1,2 0,1,2

Launch at least 2 OSDs

vagrant up osd0
vagrant up osd1
vagrant up osd2

Now log in on mon0 (for example) and check the Ceph health

vagrant ssh mon0 -c 'sudo ceph -s'
   health HEALTH_OK
   monmap e2: 2 mons at {0=192.168.252.10:6789/0,1=192.168.252.11:6789/0}, election epoch 4, quorum 0,1 0,1
   osdmap e35: 6 osds: 6 up, 6 in
    pgmap v158: 192 pgs: 192 active+clean; 0 bytes data, 242 MB used, 23601 MB / 23844 MB avail

puppet-ceph's People

Contributors

arnewiebalck, ctrlaltdel, dalgaaf, dotwaffle, drfranklin, dvanders, fcharlier, fredericlepied, hggh, jon-proulx, kurthuwig, mcanevet, mjeanson, sbadia, spredzy, toabctl, trondiz

puppet-ceph's Issues

Mon id as variable

Trying to set up a monitor-only role:

$id      = fqdn_rand(7)
$mon_key = "AQCXR1tUQK5DHhAAEp/Nfj/tGBWbYoXvHB586w=="

    ceph::mon { id:
        monitor_secret => $mon_key,
        #mon_addr       => '192.168.0.10', # The host's «public» IP address
    }

generates this ceph.conf:

[mon]
mon data = /var/lib/ceph/mon/mon.$id

not

[mon]
mon data = /var/lib/ceph/mon/mon.7
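
This looks like an interpolation problem rather than a module bug: barewords and single-quoted strings are not interpolated in Puppet. A minimal sketch of the fix, passing the variable's value as the resource title:

$id      = fqdn_rand(7)
$mon_key = "AQCXR1tUQK5DHhAAEp/Nfj/tGBWbYoXvHB586w=="

ceph::mon { "${id}":          # double quotes interpolate; the bareword id (or '$id') is taken literally
  monitor_secret => $mon_key,
}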

changes to single osd restart all osds on a server, and naming /dev/sdX devices

Today we have been testing drive failure/replacement and noticed a couple of shortcomings in the device.pp manifest:

  1. When a disk is replaced, ceph.conf changes, and this results in a service restart of all the OSDs on the server (because of the subscribe => Concat['/etc/ceph/ceph.conf'] in each OSD service). These restarts cause a noticeable disruption. Ideally we want to restart only the affected service, not all of them!
  2. Using /dev/sdX names for disks isn't ideal either, since a replacement drive gets a new name (e.g. today we pulled sdq, then reinserted it and it came back as sdab). We then need to do one of:
    (a) change our host manifests to add osd::device (sdab), which isn't good since the device will return to sdq after a reboot, or
    (b) reboot the server to get the device called sdq once again.

Do people already have experience with better practices to prevent these two problems? Help is much appreciated!
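
One possible mitigation, as an untested sketch: address disks by their persistent /dev/disk/by-id symlinks, assuming ceph::osd::device accepts any block-device path (the device name below is made up). Note that device.pp currently hardcodes partition names as ${name}1 (see the dmcrypt issue further down), so this would need the same partition-naming fix:

# by-id names survive reboots and drive re-insertion
ceph::osd::device { '/dev/disk/by-id/scsi-SATA_ST3000DM001_Z1F41BLC': }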

Cheers, Dan
CERN IT

Simplify key creation

Hi,

I suggest getting rid of the ceph-authtool command and thereby simplifying key creation:

$ ceph -n mon. -k /var/lib/ceph/mon/mon.0/keyring auth get-or-create client.admin mon 'allow *' osd 'allow *' mds allow | tee -a /etc/ceph/keyring
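
Wrapped in Puppet, that could look roughly like this sketch (the resource name and guard are assumptions, not the module's actual code; -o writes the keyring to a file instead of piping through tee):

exec { 'ceph-create-admin-key':
  command => "ceph -n mon. -k /var/lib/ceph/mon/mon.0/keyring auth get-or-create client.admin mon 'allow *' osd 'allow *' mds allow -o /etc/ceph/keyring",
  creates => '/etc/ceph/keyring', # only run while the keyring does not exist yet
  path    => ['/usr/bin', '/bin'],
}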

Cheers!

SSL Error: Can't download images via vagrant up

I'm unable to download the image. There is some issue with the SSL cert. I'm getting these error messages:

vagrant up mon0
There were warnings and/or errors while loading your Vagrantfile.
Your Vagrantfile was written for an earlier version of Vagrant,
and while Vagrant does the best it can to remain backwards
compatible, there are some cases where things have changed
significantly enough to warrant a message. These messages are
shown below.

Warnings:
* `config.vm.customize` calls are VirtualBox-specific. If you're
using any other provider, you'll have to use config.vm.provider in a
v2 configuration block.

Bringing machine 'mon0' up with 'virtualbox' provider...
[mon0] Box 'wheezy' was not found. Fetching box from specified URL for
the provider 'virtualbox'. Note that if the URL does not have
a box for this provider, you should interrupt Vagrant now and add
the box yourself. Otherwise Vagrant will attempt to download the
full box prior to discovering this error.
Downloading or copying the box...
An error occurred while downloading the remote file. The error
message, if any, is reproduced below. Please fix this error and try
again.

SSL certificate problem: unable to get local issuer certificate
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option

For the moment, the workaround that worked was to add the box by hand:

vagrant box add --insecure wheezy http://labs.enovance.com/pub/wheezy.box
Downloading or copying the box...
Extracting box...te: 2590k/s, Estimated time remaining: --:--:--)
Successfully added box 'wheezy' with provider 'virtualbox'!

But the SSL error has to go; people need to clone this and run it. Pointing http at the https site is fine, but only when the cert is valid. Any idea how to solve this?

Greetings 4k3nd0

ceph_osd_bootstrap_key breaks facter when blkid is not installed

Installing puppet-ceph as of commit 77c8db9 breaks facter.

francois@puppet:~$ sudo facter -p --debug --trace
value for macaddress is still nil
value for macaddress is still nil
Not an EC2 host
/var/lib/puppet/lib/facter/ceph_osd_bootstrap_key.rb:35: undefined method `each' for nil:NilClass (NoMethodError)
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:81:in `load'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:81:in `load_file'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:43:in `load_all'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:38:in `each'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:38:in `load_all'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:35:in `each'
        from /usr/lib/ruby/vendor_ruby/facter/util/loader.rb:35:in `load_all'
        from /usr/lib/ruby/vendor_ruby/facter/util/collection.rb:93:in `load_all'
        from /usr/lib/ruby/vendor_ruby/facter.rb:100:in `to_hash'
        from /usr/lib/ruby/vendor_ruby/facter/application.rb:26:in `run'
        from /usr/bin/facter:71
francois@puppet:~$ facter --version
1.6.18
francois@puppet:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 6.0.7 (squeeze)
Release:        6.0.7
Codename:       squeeze
francois@puppet:~$ 

adding osd and monitors on the same host fails

Applying the following manifest fails on Ceph::Key['admin'].

node 'ubuntu-vanilla' {

        class { 'apt': update_timeout => '5000' }
        class { 'ceph::apt::ceph' : release => 'dumpling' }

          if !empty($::ceph_admin_key) {
            @@ceph::key { 'admin':
              secret       => $::ceph_admin_key,
              keyring_path => '/etc/ceph/keyring',
            }
          }
        class { 'role_ceph_mon': id => 'ubuntu-vanilla' }
  ceph::osd::device { '/dev/sdb':  journalsize => '200' }
  ceph::osd::device { '/dev/sdc': journalsize => '200'}
  ceph::osd::device { '/dev/sdd': journalsize => '200'}
}

I got the following message:
message: "Could not retrieve catalog from remote server: Error 400 on SERVER: Another local or imported resource exists with the type and title Ceph::Key[admin] on node ubuntu-vanilla"

After this I made the following changes; at least it allowed adding OSDs.

diff -r ./osd/device.pp /home/dgautam/puppet-ceph/manifests/osd/device.pp
55d54
<   #if $blkid != 'undefined'  and defined( Ceph::Key['admin'] ){
59c58
<       #require => Ceph::Key['admin'],

---
>       require => Ceph::Key['admin'],
diff -r ./osd.pp /home/dgautam/puppet-ceph/manifests/osd.pp
30c30
<   #Package['ceph'] -> Ceph::Key <<| title == 'admin' |>>

---
>   Package['ceph'] -> Ceph::Key <<| title == 'admin' |>>

Being new to puppet/ceph, I am not sure of the correct solution.

update ceph::client docs to state that keyname_int is required in the keys hash

The comments in ceph::client say:

# [*keys*]
#   (required) List of pairs of key names & secrets.
#   At least a key named 'admin' should be passed.
#   Example: {
#              'admin' => {
#                secret       => 'secretadmin'
#                keyring_path => '/etc/ceph/ceph.client.admin.keyring'
#              },
#              'client1' => {
#                secret       => 'secretclient1',
#                keyring_path => '/etc/ceph/ceph.client.client1.keyring'
#              }
#            }
#   Note: if path is omitted, the current default from enovance/puppet-ceph
#   will be used: '/var/lib/ceph/tmp/${name}.keyring' which might be unsafe

Yet ceph::key has:

define ceph::key (
  $secret,
  $keyring_path = "/var/lib/ceph/tmp/${name}.keyring",
  $keyname_int,
) {

so keyname_int is required.

Though you could fix the comments, I personally think that keyname_int could default to the $name or $title of the resource. I feel like I'm specifying that value multiple times. Here's my example:


$ceph_keys = {
  'xyzkey' => {
    keyring_path => '/etc/ceph/ceph.client.xyzkey.keyring',
    secret       => hiera('xyzkey::ceph::secret'),
    keyname_int  => 'xyzkey',
  }
}

Note how 'xyzkey' is repeated several times. I think keyname_int could at least default to the name of the resource.
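
As a sketch, the proposed default (not the module's current code):

define ceph::key (
  $secret,
  $keyname_int  = $name, # proposed: default to the resource's name
  $keyring_path = "/var/lib/ceph/tmp/${name}.keyring",
) {
  # body unchanged
}

With that default, the hash above would only need keyring_path and secret.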

Release on forge.puppetlabs.com ?

Are you planning to release the enovance-ceph module on forge.puppetlabs.com?
It would be great to have this module available there so it can be installed with the "puppet module" command.

ceph-osd-crush- and ceph-osd-register- run every time

Is there some reason you didn't add an "unless" to the ceph-osd-crush-* and ceph-osd-register-* Execs in device.pp? As it is now, the auth add and crush set are executed on every puppet run for each OSD device... which in our case is once per 10 minutes at the moment.

If you see nothing wrong with it, I'd like to add something like:

ceph-osd-register...:
unless => 'ceph auth list | grep osd.${osd_id}'

ceph-osd-crush...:
unless => '... to be determined ;) '
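
Spelled out as Puppet, with the actual commands elided (the crush guard is an untested guess):

exec { "ceph-osd-register-${osd_id}":
  command => $register_command, # the existing command from device.pp, elided here
  unless  => "ceph auth list | grep -q '^osd\\.${osd_id}\$'",
}

exec { "ceph-osd-crush-${osd_id}":
  command => $crush_command,    # the existing command from device.pp, elided here
  # hypothetical: an OSD appears in the crush tree once it has been inserted
  unless  => "ceph osd tree | grep -qw 'osd.${osd_id}'",
}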

Cheers, Dan

OSD disk reuse

The first install of an OSD works. Subsequent installs also run, but the OSD fails to see the disk as usable. Possibly some remnant data? I'm still researching the issue. It may be a bug in Ceph itself, or perhaps we need to zero the drive with /dev/zero first?

keyring doesn't get created

Hello, for reasons unknown to me the keyring for the OSD is not created. Here is an example site.pp:

node 'ceph-mds1.test' {
    class { 'ceph_mds': id => 1 }
}

node 'ceph-mon0.test' {
  if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
  }
  class { 'role_ceph_mon': id => 0 }
}

node 'ceph-mon1.test' {
  class { 'role_ceph_mon': id => 1 }
}   
node 'tnode01' {

  class { 'role_ceph':
    fsid      => $::fsid,
    auth_type => 'cephx',
  }

  class { 'ceph::osd' :
    public_address  => $ipaddress_eth0,
    cluster_address => $ipaddress_eth0,
  }

  ceph::osd::device { '/dev/sdb': }

}

cmds path issues

'ceph-authtool /var/lib/ceph/tmp/keyring.mon.6 --create-keyring --name=mon. --add-key='AQCXR1tUQK5DHhAAEp/Nfj/tGBWbYoXvHB586w==' --cap mon 'allow *'' is not qualified and no path was specified. Please qualify the command or specify a path

issue with ceph::conf variables and future parser

With the future parser and Puppet 4.0, strict variable types are enforced. The integers in ceph::conf are compared against single-quoted values, and since variable types are now strict, the types must match in such comparisons.

Variables that need quoting:

  • $pool_default_size
  • $pool_default_pg_num
  • $pool_default_pgp_num

The $conf_group variable in ceph::conf also needs to be quoted, due to a deprecation in Puppet around file, user, and group modes.
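
On the caller side, the fix is to pass these values as quoted strings (a sketch; values are examples):

class { 'ceph::conf':
  fsid                 => $fsid,
  pool_default_size    => '2',   # quoted string, not a bare integer
  pool_default_pg_num  => '128',
  pool_default_pgp_num => '128',
}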

Unable to create filesystem on dmcrypt devices

Hello guys,

I ran into an issue including a device as an OSD. I want the block devices to be encrypted, so I unlock them via cryptsetup luksOpen /dev/sdb OSD-1. But when ceph::osd::device creates a GPT on it, it assumes the new partition can be accessed via ${name}1, while the block device actually offers it via /dev/mapper/OSD-1p1.

Here is the piece of code that has to change:

  exec { "mkfs_${devname}":
    command => "mkfs.xfs -f -d agcount=${::processorcount} -l \
size=1024m -n size=64k ${name}1",
    unless  => "xfs_admin -l ${name}",
    require => [Package['xfsprogs'], Exec["mkpart_${devname}"]],
  }

I guess it would be better to read parted's print output and find the partition there. For the moment I have disabled the creation of the partition table; there is no need for it right now.
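
A possible fix, as an untested sketch: derive the first partition's name from the device path instead of hardcoding ${name}1 (the /dev/mapper regex is an assumption about device-mapper naming):

# device-mapper block devices expose partitions with a 'p' separator
$first_part = $name ? {
  /^\/dev\/mapper\// => "${name}p1",
  default            => "${name}1",
}

exec { "mkfs_${devname}":
  command => "mkfs.xfs -f -d agcount=${::processorcount} -l size=1024m -n size=64k ${first_part}",
  unless  => "xfs_admin -l ${first_part}",
  require => [Package['xfsprogs'], Exec["mkpart_${devname}"]],
}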

In general you don't need to create partition tables. You only have to be careful not to run grub-install on the disk, since that would indeed overwrite the superblock. But are you planning to separate the OSD journal onto an extra partition?

Furthermore, this device doesn't get a UUID via facter. It only displays this:

Notice: /Stage[main]/cloud::Osd/Ceph::Osd::Device[/dev/mapper/OSD-0]/Notify[BLKID FACT OSD-0: blkid_uuid_OSD-01]/message: defined 'message' as 'BLKID FACT OSD-0: blkid_uuid_OSD-01'
Notice: BLKID OSD-0: undefined
Notice: /Stage[main]/cloud::Osd/Ceph::Osd::Device[/dev/mapper/OSD-0]/Notify[BLKID OSD-0: undefined]/message: defined 'message' as 'BLKID OSD-0: undefined'

I could work around it by using /sbin/blkid to get the UUID of the dmcrypt device. But maybe this issue is related to my VM? I use Ubuntu 12.04 with ruby 1.8.7 and puppet 3.2.2.

I may work on a fork and try to fix this, maybe adding a parameter to handle it, e.g. whether to create a GPT or not, plus some fine-tuning such as defining the inode size. For performance reasons you should use an inode size of 2048.

so far
4k3nd0
