suse / doc-ses
Official SUSE Enterprise Storage documentation
Home Page: https://documentation.suse.com/ses/
License: Other
6.4.3.1 Deploy Ceph Monitors and Ceph Managers
After reading:
If there are five or more nodes in the cluster, we recommend deploying five MONs
To deploy new MONs, run:
ceph orch apply mon SHORT_MON_HOST_NAME
If MONs or MGRs are not on the same subnet, you need to append the subnet address. For example:
ceph orch apply mon ses-min1:10.1.2.0/24
ceph orch apply mon ses-min2:10.1.5.0/24
ceph orch apply mgr ses-min1:10.1.2.0/24
ceph orch apply mgr ses-min2:10.1.5.0/24
I got the idea that I should run ceph orch apply mon SHORT_MON_HOST_NAME on every node in my cluster, and that each node would then have a MON deployed.
What I actually got was that only the last node I ran the command for became a MON; all the remaining nodes got nothing.
I think the documentation should mention that running ceph orch apply mon SHORT_MON_HOST_NAME deploys the MON to that node and removes it from the previously specified one.
At the very least, the subnet example should not contain commands for multiple nodes.
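What might help is showing a single command that targets several hosts at once. For example (a sketch only, assuming cephadm's placement syntax; the host names are illustrative):
ceph orch apply mon --placement="ses-min1,ses-min2,ses-min3"
This applies one MON service specification covering all three hosts, instead of repeated apply calls that overwrite each other.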
4.1 SUSE Linux Enterprise Install and Validation of Base Performance
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-sles
It is fine to install the X server as long as the default target is NOT graphical and X is only started on demand. In that case X does not consume memory or CPU.
However, during installation, upgrade, and patching, the more packages are installed, the longer the process takes and the more load it can cause. The recommendation is therefore to install only what is required and nothing more (this might be a minimal installation plus some packages).
Summary
Testing this
Steps To Reproduce
Click on EDIT and land here
Expected Results
Land here.
Actual Results
Worked.
Notes
Should I remove sentences like "Add any other context about the problem here.", or should they stay in? Can we mark them for removal?
3.1.1 Network Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.2.6.7
We should add the recommendation to use bonding hash mode layer2 or layer2+3 (layer3+4 can cause problems).
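For illustration (a sketch only; the bond device and option values are illustrative, not the wording to put into the guide), such a recommendation could point to the xmit_hash_policy bonding option in the SLES ifcfg configuration, for example in /etc/sysconfig/network/ifcfg-bond0:
BONDING_MODULE_OPTS='mode=802.3ad miimon=100 xmit_hash_policy=layer2+3'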
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#ses-deployment
Summary
The description for "7 Customizing the Default Configuration" is incomplete and finishes with "whi..."
2 Best Practices Introduction
2.1 Preparing your Documentation
2.2 Review the Environment
2.3 Requirements Assessment
2.4 Review the Design
3 Infrastructure Services Preparation
3.1 Prepare Fault Tolerant Time Synchronisation Sources
3.2 Ensure Proper Name Resolution
4 Server Installation Preparation
4.1 Software Staging System
4.2 Server Installation Framework
4.3 Image Repository
4.4 Helm Chart Repository
4.5 Installing the Operating System
4.6 Optional: Deploy CaaSP
5 Server Installation
5.1 Hardware Preparation
5.2 Data Center Preparation
5.3 Update Firmware and BIOS
5.4 Configure Firmware and BIOS Settings
5.5 Configure Hardware Date and Time
5.6 Configure Networking
5.7 Deploy the Operating System
6 Server Configuration
6.1 Configuration Management Using Salt
7 Verifying the Deployment
7.1 Test the Deployment
7.2 Verify Time Synchronization
7.3 Verify Name Resolution
7.4 Test Storage Devices
7.5 Test Network Connections
8 Cluster Deployment
8.1 Adjust Ceph Parameters
8.2 Deploying with cephadm
8.3 Deploying on top of CaaSP
8.4 Cluster DNS
9 Ceph Configuration
9.1 Adjust Ceph CRUSH Map
9.2 Adjust Existing Pools
9.3 Adjust Gateway Settings
9.4 Verify Functionality
10 Testing Best Practices
10.1 Documenting Testplans
10.2 Functional Testing
10.3 Fault Tolerance Testing
10.4 Performance Testing
11 Operation Best Practices
11.1 Adding and Removing Disks
11.2 CRUSH Map Adjustments
12 Upgrade Practices
UPDATE: recommend applying services with ceph orch apply -i YAML_FILE instead of specifying them on the command line.
Get inspired by:
https://ceph.readthedocs.io/en/latest/mgr/orchestrator/#placement-specification
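A sketch of what such a service specification file could look like (file name and host names are illustrative):
service_type: mon
placement:
  hosts:
    - ses-min1
    - ses-min2
    - ses-min3
It would then be applied with:
ceph orch apply -i mon-spec.yml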
18.2.2 Verify DNS
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-dns
The example for resolving an IP address to a host name is missing the nslookup command.
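For example (a sketch; the IP address is illustrative), the missing reverse lookup could be shown as:
nslookup 192.168.100.42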
6.4.5 DriveGroups
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#drive-groups
In the simplest case this could be the 'rotational' flag (all solid-state drives are to be db_devices, all rotating ones data devices) or something more involved such as 'model' strings, or sizes.
Providing only rotational: true does not create WAL/DB devices on SSDs. @jschmid1 Can you confirm this? Thanks!
DeepSea will provide code that translates these DriveGroups into actual device lists for inspection by the user.
DeepSea -> cephadm (this sentence should refer to cephadm, not DeepSea).
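For reference (a sketch only, assuming the cephadm OSD service specification format; the service ID and filters are illustrative), a rotational-based DriveGroup that puts data on HDDs and DB devices on SSDs would need both filters:
service_type: osd
service_id: default_drive_group
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0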
Link To Section
The documentation should probably be added to the "Deployment" guide, as its content seems to fit there best. But maybe it makes sense to have it in the "Administration" guide as well.
The monitoring stack is automatically deployed alongside the Ceph cluster when cephadm is used for the deployment and is not explicitly told not to deploy the monitoring stack. So cephadm is the right context, though I do not see a Monitoring heading in the cephadm chapter in the SES7 documentation where that might fit.
Summary
We need to document a non-obvious security model of Prometheus (part of our monitoring stack) to make users aware of it.
Basic Example
Basically, the information contained in the note block of https://github.com/ceph/ceph/pull/34902/files#diff-e5580505067bb51bc581a57fbb91df28R391-R403 (marked in the link) needs to be added. It may make sense or be necessary to add the paragraph above it, too, to have a short introduction to that the monitoring topic.
Motivation
The security model and its implications may not be obvious and we hence should briefly describe it and have a reference to the full documentation.
Notes
"Monitors" or "Mons" are a component of Ceph and are not related to "Monitoring" where its about collecting metadata of the Ceph cluster and related machines to graphically visualize them and provide alerting.
3.3.1 RocksDB and WAL
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.2.8.2
If we have slow (HDD) and fast (NVMe) devices, we do not create a separate WAL partition (WAL=0) and instead size the RocksDB device at 64 GB.
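For illustration (a sketch only; the option names are an assumption based on the upstream BlueStore configuration options, and the values are examples), this could be expressed in the [osd] section of ceph.conf as:
bluestore_block_wal_size = 0
bluestore_block_db_size = 68719476736   # 64 GB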
19.2.1 Automatic Cache Sizing
https://susedoc.github.io/doc-ses/master/single-html/ses-admin/#config-auto-cache-sizing
Document does not provide guidance for selecting an osd_memory_target value. This is important, as this selection will further drive hardware requirements.
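For illustration (a sketch only; the value is an example, not a recommendation), the setting itself can be applied with:
ceph config set osd osd_memory_target 8589934592   # 8 GB per OSD daemon
The guidance should explain how to derive such a value from the available RAM and the number of OSDs per node.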
Figure 2.1: Network Overview
http://docserv.nue.suse.com/documents/SES_6/ses-deployment/single-html/#network-overview-figure
I have a few problems with this diagram:
6.3.2.11 Verify Cluster Configuration
In 6.3.2.4 Specify First MON/MGR Node there is an important note telling you to also add the admin role to the bootstrap node.
I think this should be reflected in the ceph-salt config ls example.
Current:
ceph-salt config ls
o- / ............................................................... [...]
o- ceph_cluster .................................................. [...]
| o- minions .............................................. [Minions: 5]
| | o- ses-master.example.com .................................. [admin]
| | o- ses-min1.example.com ................................ [bootstrap]
| | o- ses-min2.example.com ................................. [no roles]
Proposed:
ceph-salt config ls
o- / ............................................................... [...]
o- ceph_cluster .................................................. [...]
| o- minions .............................................. [Minions: 5]
| | o- ses-master.example.com .................................. [admin]
| | o- ses-min1.example.com ......................... [bootstrap, admin]
| | o- ses-min2.example.com ................................. [no roles]
20.1.2 Single Gateway Example
http://docserv.nue.suse.com/documents/SES_6/ses-admin/single-html/#sec-ses-cifs-example
This walkthrough should include instructions for adding a user to the local smbpasswd file. E.g.:
smbpasswd -a smbuser
where smbuser should already exist as a local Unix user.
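For example (a sketch; the user name is illustrative), the full sequence could be shown as:
useradd -m smbuser
smbpasswd -a smbuser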
4.1.2 Storage Performance
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.4.6
Similar to the iperf3 command we should also give a fio command example, here.
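A possible example (a sketch only; the file path and parameters are illustrative and not a recommendation):
fio --name=rand-write --filename=/var/tmp/fio-test --size=4G --direct=1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --runtime=60 --time_based --group_reporting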
15.13.2.7 Update the Ceph Configuration File
https://documentation.suse.com/ses/6/single-html/ses-admin/#update-ceph-config-file
Could we add the exact files and adjustments required so that the DeepSea setup matches the ceph.conf adjustments required for this multi-zone RGW setup, and so that each RGW keeps the correct zone in its ceph.conf even after running through all DeepSea stages?
(global.yml, rgw...conf, ...)
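For illustration (a sketch only; the section, zone, and zone group names are invented for the example), the kind of ceph.conf fragment that needs to survive the DeepSea stages is:
[client.rgw.rgw1]
rgw_zone = zone-a
rgw_zonegroup = zg-default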
4.2.3 SSD Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.5
Could we add some details on how much performance improvement during read this tuning achieved in the described test cluster and how much performance loss was on the write side by changing these settings?
Link To Section
SES6 Administration Guide, Table of Contents
Summary
The appendices are numbered strangely.
Steps To Reproduce
Examine the ToC, see the following:
A DeepSea Stage 1 Custom Example
B Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases
Glossary
C Documentation Updates
C.1 Maintenance update of SUSE Enterprise Storage 6 documentation
C.2 June 2019 (Release of SUSE Enterprise Storage 6)
and note that the appendix "Glossary" (with no letter) is situated after Appendix B, and before Appendix C.
Expected Results
My expectation would be that the Appendices would have ascending letters, with no skipped letters. Also, if there are both lettered and unlettered appendices, the lettered ones would come first, and the unlettered ones second, instead of the unlettered ones being interspersed among the lettered ones.
Link To Section
susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#deploy-cephadm-configure-admin
Summary
Missing additional invocation
Notes
If a new container is available, but the Ceph version remains the same, then
# ceph orch upgrade start REGISTRY_URL
such as
# ceph orch upgrade start registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph
will update the containers although both are 15.2.1
Side note: we are only mentioning the container upgrade here. What about the host upgrade? Do we expect the customer to do zypper up, use ceph-salt, or something else?
5.1 Obtaining Ceph Metrics
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-obtaining-metrics
Here I read this:
it is helpful to have an understanding of the critical metrics to monitor and what they indicate
So this automatically raises the question: where can I get the explanations for this, so that I can understand the output of "ceph daemon osd.0 perf dump | less"? I would have expected the relevant parameters and values to be explained here, or a pointer to the place where I can find that information.
6.4.5.1 Specification
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#drive-groups-specs
For FileStore setups, drive_groups.yml can be as follows:
There is no FileStore in SES7, so this section can be dropped. We need to inform users to migrate FileStore OSDs to BlueStore in the upgrade section.
2.3 Hardware and Software
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-hardware-software
I assume this SES 5.5 line is wrong and should read 6?
Figure 2.1: Ceph Dashboard Login Screen
https://susedoc.github.io/doc-ses/master/html/ses-admin/ceph-dashboard.html#id-1.4.4.2.5.2.4
Dashboard login credentials are no longer obtained with
salt-call grains.get dashboard_creds
You need to set them during basic cluster deployment with ceph-salt, or later on with ceph dashboard ac-user-set-password admin <new_password>.
Folks using the documentation to deploy encrypted OSDs might want to convince themselves that the resulting OSDs are, in fact, encrypted. The SES6 documentation does not currently provide any guidance for this case.
After using DeepSea to deploy encrypted OSDs as described elsewhere in the documentation, here is a procedure for easily - and without too much disruption - verifying that the OSDs are encrypted.
Check the output of ceph-volume lvm list
(it should be run as root on the node where the OSDs in question are located):
# ceph-volume lvm list
====== osd.3 =======
[block] /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
block device /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
block uuid m5F10p-tUeo-6ZGP-UjxJ-X3cd-Ec5B-dNGXvG
cephx lockbox secret
cluster fsid 413d9116-e4f6-4211-a53b-89aa219f1cf2
cluster name ceph
crush device class None
encrypted 0
osd fsid f8596bf7-000f-4186-9378-170b782359dc
osd id 3
type block
vdo 0
devices /dev/vdb
====== osd.7 =======
[block] /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987
block device /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987
block uuid 1y3qcS-ZG01-Y7Z1-B3Kv-PLr6-jbm6-8B79g6
cephx lockbox secret
cluster fsid 413d9116-e4f6-4211-a53b-89aa219f1cf2
cluster name ceph
crush device class None
encrypted 0
osd fsid 0f9a8002-4c81-4f5f-93a6-255252cac2c4
osd id 7
type block
vdo 0
devices /dev/vdc
Note the line that says encrypted 0 - this means the OSD is not encrypted. Here are the possible values:
encrypted 0 <- not encrypted
encrypted 1 <- encrypted
If you get the following error, it means the node where you are running the command does not have any OSDs on it:
# ceph-volume lvm list
No valid Ceph lvm devices found
OK, enough about ceph-volume lvm list, already! Let's say I have deployed a cluster with an OSD for which ceph-volume lvm list shows encrypted 1, so I know ceph-volume thinks that OSD is encrypted. But how can I gain more confidence that it's really encrypted? For that, proceed to "Step 2".
Ceph OSD encryption-at-rest relies on the Linux kernel's dm-crypt subsystem and the Linux Unified Key Setup ("LUKS"). When creating an encrypted OSD, ceph-volume creates an encrypted logical volume and saves the corresponding dm-crypt secret key in the Ceph Monitor data store. When the OSD is to be started, ceph-volume ensures the device is mounted, retrieves the dm-crypt secret key from the Ceph Monitors, and decrypts the underlying device. This creates a new device, containing the unencrypted data, and this is the device the Ceph OSD daemon is started on.
Since the OSD itself does not know whether the underlying logical volume is encrypted or not, there is no ceph osd command that will return this information. But it is possible to query LUKS for it, as follows.
First, get the device of the OSD logical volume you are interested in. This can be obtained from the ceph-volume lvm list output:
block device /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
Then, dump the LUKS header from that device:
# cryptsetup luksDump OSD_BLOCK_DEVICE
Here is what the output looks like when the OSD is NOT encrypted:
Device /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987 is not a valid LUKS device.
And when it IS encrypted:
master:~ # cryptsetup luksDump /dev/ceph-1ce61157-81be-427d-83ad-7337f05d8514/osd-data-89230c92-3ace-4685-97ff-6fa059cef63a
LUKS header information for /dev/ceph-1ce61157-81be-427d-83ad-7337f05d8514/osd-data-89230c92-3ace-4685-97ff-6fa059cef63a
Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha256
Payload offset: 4096
MK bits: 256
MK digest: e9 41 85 f1 1b a3 54 e2 48 6a dc c2 50 26 a5 3b 79 b0 f2 2e
MK salt: 4c 8c 9d 1f 72 1a 88 6c 06 88 04 72 81 7b e4 bb
b1 70 e1 c2 7c c5 3b 30 6d f7 c8 9c 7c ca 22 7d
MK iterations: 118940
UUID: 7675f03b-58e3-47f2-85fc-3bafcf1e589f
Key Slot 0: ENABLED
Iterations: 1906500
Salt: 8f 1f 7f f4 eb 30 5a 22 a5 b4 14 07 cc da dc 48
b5 e9 87 ef 3b 9b 24 72 59 ea 1a 0a ec 61 e6 42
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
5.5.1 MDS Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.4.12.3
I assume ceph-daemon perf dump should read ceph daemon perf dump.
4.2.1 CPU Mitigations
mitigations=off is set on the kernel command line by adjusting /etc/default/grub, regenerating the GRUB2 configuration, and rebooting.
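A sketch of the typical procedure (assuming the default SLES GRUB2 layout; the existing kernel parameters are illustrative):
# in /etc/default/grub, append mitigations=off to the kernel command line:
GRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet mitigations=off"
# regenerate the GRUB configuration and reboot:
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot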
19.5 Setting the Log Level
https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-nfsganesha.html#ceph-nfsganesha-loglevel
"You change the default debug level NIV_EVENT by editing the file /etc/sysconfig/nfs-ganesha"
This file does not exist, but /etc/sysconfig/ganesha does. I think /etc/sysconfig/ganesha is created by one of nfs-ganesha's RPM scriptlets. This was seen with nfs-ganesha-2.8.3+git0.d504d374e-3.3.1.x86_64.
Links To Duplicated Sections
https://documentation.suse.com/ses/6/single-html/ses-admin/#id-1.3.10
https://documentation.suse.com/ses/6/single-html/ses-deployment/#id-1.4.7
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.7
Summary
The chapter/section "Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases" appears in each documentation book. This is perhaps by design (though I personally would not do it this way), but we also have the Glossary, which is duplicated in the same way.
The difference between "Glossary" and "Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases" is that the former has no letter associated, while the latter gets an automatically assigned letter (Appendix B, Appendix D, etc.) which is different for each book. In my mind, this unique letter makes me expect the material to also be unique to the book in question.
Steps To Reproduce
Expected Results
To ease maintenance and lessen overall confusion, I would expect that we would not duplicate entire sections, and that given material would appear in the documentation in only one place.
If we really want to duplicate this material and have it in each book, I would prefer to have it as an unnumbered appendix like the Glossary, which is also identical in each book.
6.4.3.1 Deploy Ceph Monitors and Ceph Managers
I guess this is a docs problem, but it may also be something we want to support feature-wise. For now, the docs are wrong in this section and its accompanying terminal examples:
If all MONs or MGRs are on the same subnet, specify the short host names of the MONs or MGRs you need to add:
If you try to use multiple hostnames then the command returns the following
enno1 not valid: enno1 not one of 'true', 'false'
Invalid command: unused arguments: ['enno1']
orch apply [mon|mgr|rbd-mirror|crash|alertmanager|grafana|node-exporter|prometheus] [<placement>] [--unmanaged] : Update the size or placement for a service or apply a large yaml spec
Error EINVAL: invalid command
If I then try this with only one hostname, everything is working out correctly.
18.2.9 Starting the Services
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-services
As discussed earlier in #243 , the AD join procedure assumes that Samba services should be enabled / started via systemctl, but this is incorrect if Samba services are managed by Pacemaker, as is the case for High Availability configurations (https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#sec-ses-cifs-ha).
Hi,
in Section 12 of the Deployment Guide for SES 6 there is the following warning:
"Native CephFS and NFS clients are not restricted by file locks obtained via Samba, and vice versa. Applications that rely on cross protocol file locking may experience data corruption if CephFS backed Samba share paths are accessed via other means."
Is this the only cross-protocol issue we have, or are there other possibilities which could lead to corrupted data? E.g. S3 data accessed via CIFS...
It would be useful to provide a list of all possible cross-protocol issues.
5.3.1 Logging
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.4.10.2
From my understanding, we should not recommend setting this in this way in any production environment, and we should be very clear about that.
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#id-1.4.7
Summary
The last upstream Nautilus release mentioned in the appendix is 14.2.5. We need to add sections for 14.2.6, 14.2.7, 14.2.8, and 14.2.9
Text of the New Sections
14.2.6
This release fixed a ceph-mgr bug that caused the MGR to become unresponsive on larger clusters. SES users were never exposed to the bug.
14.2.7
This upstream release patched two security flaws:
In SES6, these flaws were patched in ceph 14.2.5.382+g8881d33957b.
14.2.8
In addition to bugfixes, this major upstream release brought a number of notable changes:
The default value of bluestore_min_alloc_size_ssd has been changed to 4K to improve performance across all workloads.
The following OSD memory config options related to bluestore cache autotuning can now be configured during runtime:
osd_memory_base (default: 768 MB)
osd_memory_cache_min (default: 128 MB)
osd_memory_expected_fragmentation (default: 0.15)
osd_memory_target (default: 4 GB)
The above options can be set with:
ceph config set osd <option> <value>
The MGR now accepts profile rbd and profile rbd-read-only user caps. These caps can be used to provide users access to MGR-based RBD functionality such as rbd perf image iostat and rbd perf image iotop.
The configuration value osd_calc_pg_upmaps_max_stddev used for upmap balancing has been removed. Instead, use the MGR balancer configuration option upmap_max_deviation, which now is an integer number of PGs of deviation from the target PGs per OSD. This can be set with a command like ceph config set mgr mgr/balancer/upmap_max_deviation 2. The default upmap_max_deviation is 5. There are situations where CRUSH rules would not allow a pool to ever have completely balanced PGs. For example, if CRUSH requires 1 replica on each of 3 racks, but there are fewer OSDs in 1 of the racks. In those cases, the configuration value can be increased.
CephFS: multiple active MDS forward scrub is now rejected. Scrub is currently only permitted on a file system with a single rank. Reduce the ranks to one via ceph fs set <fs_name> max_mds 1.
Ceph now refuses to create a file system with a default EC data pool. For further explanation, see:
https://docs.ceph.com/docs/nautilus/cephfs/createfs/#creating-pools
Ceph will now issue a health warning if a RADOS pool has a pg_num value that is not a power of two. This can be fixed by adjusting the pool to a nearby power of two:
ceph osd pool set <pool-name> pg_num <new-pg-num>
Alternatively, the warning can be silenced with:
ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
14.2.9
This upstream release patched two security flaws:
In SES6, these flaws were patched in ceph 14.2.5.389+gb0f23ac248.
Notes
The new sections should be added to the top of the appendix, and in reverse order, to preserve the existing reverse ordering of the sections.
Insert new section advising users to set their default dashboard username & password:
ceph-salt config /cephadm_bootstrap/dashboard/username set admin
ceph-salt config /cephadm_bootstrap/dashboard/password set admin
20.1.2 Single Gateway Example
http://docserv.nue.suse.com/documents/SES_6/ses-admin/single-html/#sec-ses-cifs-example
We currently only provide instructions for using Samba with the ceph VFS backend. Instructions for using Samba atop a kernel CephFS mount should also be present.
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-main-osd
Summary
Now that msgr2 is fully supported in SES6 [1], we need to instruct users to enable it at the appropriate point in the SES5.5 -> 6 upgrade procedure, using the instructions at [2].
[1] #310
[2] https://ceph.io/releases/v14-2-0-nautilus-released/#instructions
What Exact Changes To Make To the Upgrade Procedure
In the section "Upgrade Ceph OSDs" of the SES6 Deployment Guide, at step 9 all of the OSD nodes will have been upgraded and the user will be checking the cluster health. There, they will notice (and this is correctly reflected in the documentation) a health warning message:
health: HEALTH_WARN
3 monitors have not enabled msgr2
To address this and another issue, let us add new steps 10 and 11 (after step 9, but before the existing step 10, which will be renumbered as a result of inserting these new steps):
Once the last OSD node has been upgraded, issue the following command:
ceph osd require-osd-release nautilus
This disallows pre-SES6/Nautilus OSDs and enables all new SES6/Nautilus-only OSD functionality.
Enable the new v2 network protocol by issuing the following command:
ceph mon enable-msgr2
This will instruct all monitors that bind to the old default port 6789 for the legacy v1 Messenger protocol to also bind to the new v2 protocol port 3300. To see whether all monitors have been updated, run:
ceph mon dump
and verify that each monitor has both a v2: and a v1: address listed.
6.3.3 Deploy Cluster
Ricardo Marques: "It would be great to document that, when reboot is enabled in the ceph-salt config (it is enabled by default), minions will reboot automatically after package updates (if needed).
If the Salt Master (the host where you are running the ceph-salt apply command) is part of the Ceph cluster (i.e. it is a /ceph_cluster/minion) and needs a reboot, this reboot must be done manually. After rebooting the master, the user must run ceph-salt apply again to continue the deployment."
Link To Section
http://docserv.suse.de/documents/SES_6/ses-admin/html/storage-tips.html#id-1.3.8.2.14.7
Summary
The documentation suggests overriding a file that is shipped by the installation package; because of that, all changes will be lost after an update.
Steps To Reproduce
Expected Results
The documentation should explain that it is possible to customize the default behavior of the disk_led.sls by using the DeepSea way (https://github.com/SUSE/DeepSea/blob/master/srv/pillar/ceph/README).
This is done for example here: http://docserv.suse.de/documents/SES_6/ses-admin/html/storage-troubleshooting.html#storage-bp-recover-clockskew
Actual Results
Notes
The documentation should mention this:
The default behavior can be customized by adding, for example for the ledmon package and the ledctl tool:
disk_led:
cmd:
ident:
'on': ledctl locate='{device_file}'
'off': ledctl locate_off='{device_file}'
fault:
'on': ledctl locate='{device_file}'
'off': ledctl locate_off='{device_file}'
to /srv/pillar/ceph/stack/global.yml or any other YAML file mentioned in https://github.com/SUSE/DeepSea/blob/master/srv/pillar/ceph/README. If the customization should only apply to a specific node (minion), the file stack/ceph/minions/{{minion}}.yml needs to be used instead. Simply reference the section where this DeepSea feature is explained.
Link To Section
This issue is about adding a new section in between the following two sections:
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-main-osd
https://documentation.suse.com/ses/6/single-html/ses-deployment/#filestore2bluestore
Summary
The purpose of the new section is to give users some tips on final steps they should take after completing the SES5.5 -> 6 upgrade, as described in the upstream Mimic -> Nautilus upgrade procedure [1].
[1] https://ceph.io/releases/v14-2-0-nautilus-released/#instructions
What Exact Changes to Make to the SES6 Deployment Guide
Immediately after existing section "6.12 Upgrade Ceph OSDs", and immediately before existing section "6.13 OSD Migration to BlueStore", insert a completely new section with the following content:
6.13 Final Steps
Final steps to take to complete the upgrade:
For each host that has been upgraded (not just OSD nodes, but also MON, MGR, MDS, and Gateway nodes, as well as client hosts), you should update your ceph.conf file so that it either specifies no monitor port (if you are running the monitors on the default ports) or references both the v2 and v1 addresses and ports explicitly. (Things will still work if only the v1 IP and port are listed, but each CLI instantiation or daemon will need to reconnect after learning that the monitors also speak the v2 protocol, slowing things down a bit and preventing a full transition to the v2 protocol.)
Finally, consider enabling the telemetry module to send anonymized usage statistics and crash information to the upstream Ceph developers. To see what would be reported (without actually sending any information to anyone):
ceph mgr module enable telemetry
ceph telemetry show
If you are comfortable with the high-level cluster metadata that will be reported, you can opt-in to automatically report it:
ceph telemetry on
6.3.2.10 Configure Cluster Update Behavior
I would like to see an explanation or arguments for why, or in which situations, it might make sense to disable package updates or prevent automatic node reboots during cluster deployment.
5.5.2.1 Specification
https://documentation.suse.com/ses/6/single-html/ses-deployment/#ds-drive-groups-specs
This subsection presents two examples: one for BlueStore and another for FileStore. The introductory text for the first example does not mention that it is for BlueStore, while the one for the second example does.
Let's fix this, perhaps by changing the first intro text from
/srv/salt/ceph/configuration/files/drive_groups.yml accepts the following options:
to
/srv/salt/ceph/configuration/files/drive_groups.yml can take one of two basic forms, depending on whether BlueStore or FileStore is to be used.
For BlueStore setups, drive_groups.yml can be as follows:
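For illustration only (a sketch, not the exact text to insert; the group name and filters are assumptions based on the DeepSea DriveGroups format), a minimal BlueStore example could look like:
drive_group_default:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0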
4.1.1 Network Performance
Depending on the hash mode and on client vs. server bandwidth, we might have to run multiple iperf instances in parallel to measure the full combined bandwidth that is possible (single connection vs. multiple parallel connections).
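For example (a sketch; the server address, stream count, and duration are illustrative), iperf3 can run parallel streams in a single invocation:
iperf3 -c 192.168.100.1 -P 8 -t 30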
4.2.4.1.2 PCIe Bus Adjustment
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.6.4.3
salt cmd.run just adjusts settings "on the fly", and after a reboot they are back to the defaults. So when recommending these tunings, we should also advise on how to make them persistent so that they survive a reboot.
(This needs to be done for the whole document: search for all salt '*' cmd.run invocations and add the information on how to make the setting persistent.)
4.2.4.1.1 Jumbo Frames
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.6.4.2
should be adjusted to:
To set this persistently, use YaST to set the MTU for all the network interfaces in use, or adjust the MTU settings in the /etc/sysconfig/network/ifcfg-* files and restart the network. Please ensure that jumbo frames are enabled on all switches, routers, and network cards that will transport these packets.
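For example (a sketch; the interface name is illustrative), in /etc/sysconfig/network/ifcfg-eth0 set:
MTU='9000'
and then restart the network, for example with systemctl restart wicked.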
6.2 Deploy Salt
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#deploy-salt
Under "View each minion's fingerprint:" the terminal prompt reads root@master. This is inconsistent with the line above, which says that this needs to be executed on a minion.
Executing the example on the master does not give any output; executing it on a minion does indeed give the desired and shown output.
18.2.9 Starting the Services
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-services
As of SLES 15 SP1, the names are 'smb.service', 'nmb.service', and 'winbind.service'.
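For example (illustrative, and only applicable when the services are managed by systemd rather than by Pacemaker):
systemctl enable smb.service nmb.service winbind.service
systemctl start smb.service nmb.service winbind.service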
6.8.1 Manual Node Upgrade Using the Installer DVD
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-one-node-manual
The next quarterly update medium - to be released next week - will contain a fix for https://bugzilla.suse.com/show_bug.cgi?id=1162545, see also https://download.suse.com/Download?buildid=0Hf9s7nDqOw~
The change adds an option "YAST_ACTIVATE_LUKS=0" that can be given on the YaST command line when booting from DVD, so that no password is requested for encrypted disks.
Let's document the option and state that this needs the Quarterly update from April 2020.
Link To Section
https://www.suse.com/releasenotes/x86_64/SUSE-Enterprise-Storage/6/#Intro.Support.Techpreviews
Summary
As of the next SES6 documentation maintenance update (the one that will deliver the related doc changes), msgr V2 is no longer considered a Tech Preview, but rather a fully supported feature of SES6.
What This Means for the SES6 Release Notes
In the "Tech Previews" section, the following "Ceph Core" bullet point should be dropped:
New network protocol, or “wire protocol” version 2 (enhancing current protocol, version 1), adding support for on-the-wire encryption. Ceph daemons use this protocol internally to communicate with one another (called the Messenger and often abbreviated to "msgr").
In the immediately following section ("Features"), a new subsection should be added as part of "4.6 RADOS (Ceph core)". It should look like this:
Messenger V2:
SES6 brings a new network protocol, or “wire protocol” version 2 (enhancing the current protocol, version 1). The new protocol brings support for on-the-wire encryption. Ceph daemons use this protocol (called the Messenger and often abbreviated to "msgr") internally to communicate with one another.