suse / doc-ses
Official SUSE Enterprise Storage documentation
Home Page: https://documentation.suse.com/ses/
License: Other
6.4.3.1 Deploy Ceph Monitors and Ceph Managers
After reading:
If there are five or more nodes in the cluster, we recommend deploying five MONs
To deploy new MONs, run:
ceph orch apply mon SHORT_MON_HOST_NAME
If MONs or MGRs are not on the same subnet, you need to append the subnet address. For example:
ceph orch apply mon ses-min1:10.1.2.0/24
ceph orch apply mon ses-min2:10.1.5.0/24
ceph orch apply mgr ses-min1:10.1.2.0/24
ceph orch apply mgr ses-min2:10.1.5.0/24
I got the idea that I should run ceph orch apply mon SHORT_MON_HOST_NAME on every node in my cluster, and that each node would then have a MON deployed.
What I actually got was that only the last node I ran the command for became a MON; all the remaining nodes got nothing.
I think the documentation should mention that running ceph orch apply mon SHORT_MON_HOST_NAME deploys the MON to that node and removes it from the previously specified one.
At the very least, the subnet example should not contain commands for multiple nodes.
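What might help is showing a single command that targets several hosts at once. For example (a sketch only, assuming cephadm's placement syntax; the host names are illustrative):
ceph orch apply mon --placement="ses-min1,ses-min2,ses-min3"
This applies one MON service specification covering all three hosts, instead of repeated apply calls that overwrite each other.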
4.1 SUSE Linux Enterprise Install and Validation of Base Performance
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-sles
It is fine to install the X server as long as the default target is NOT graphical and X is only started on demand. In that case X does not consume memory or CPU.
However, during installation, upgrade, and patching, the more packages are installed, the longer the process takes and the more load it can cause. The recommendation is therefore to install only what is required and nothing more (this might be a minimal installation plus some packages).
Summary
Testing this
Steps To Reproduce
Click on EDIT and land here
Expected Results
Land here.
Actual Results
Worked.
Notes
Should I remove sentences like "Add any other context about the problem here.", or should they stay in? Can we mark them for removal?
3.1.1 Network Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.2.6.7
We should add the recommendation to use bonding hash mode layer2 or layer2+3 (layer3+4 can cause problems).
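For illustration (a sketch only; the bond device and option values are illustrative, not the wording to put into the guide), such a recommendation could point to the xmit_hash_policy bonding option in the SLES ifcfg configuration, for example in /etc/sysconfig/network/ifcfg-bond0:
BONDING_MODULE_OPTS='mode=802.3ad miimon=100 xmit_hash_policy=layer2+3'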
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#ses-deployment
Summary
The description for "7 Customizing the Default Configuration" is incomplete and finishes with "whi..."
2 Best Practices Introduction
2.1 Preparing your Documentation
2.2 Review the Environment
2.3 Requirements Assessment
2.4 Review the Design
3 Infrastructure Services Preparation
3.1 Prepare Fault Tolerant Time Synchronisation Sources
3.2 Ensure Proper Name Resolution
4 Server Installation Preparation
4.1 Software Staging System
4.2 Server Installation Framework
4.3 Image Repository
4.4 Helm Chart Repository
4.5 Installing the Operating System
4.6 Optional: Deploy CaaSP
5 Server Installation
5.1 Hardware Preparation
5.2 Data Center Preparation
5.3 Update Firmware and BIOS
5.4 Configure Firmware and BIOS Settings
5.5 Configure Hardware Date and Time
5.6 Configure Networking
5.7 Deploy the Operating System
6 Server Configuration
6.1 Configuration Management Using Salt
7 Verifying the Deployment
7.1 Test the Deployment
7.2 Verify Time Synchronization
7.3 Verify Name Resolution
7.4 Test Storage Devices
7.5 Test Network Connections
8 Cluster Deployment
8.1 Adjust Ceph Parameters
8.2 Deploying with cephadm
8.3 Deploying on top of CaaSP
8.4 Cluster DNS
9 Ceph Configuration
9.1 Adjust Ceph CRUSH Map
9.2 Adjust Existing Pools
9.3 Adjust Gateway Settings
9.4 Verify Functionality
10 Testing Best Practices
10.1 Documenting Testplans
10.2 Functional Testing
10.3 Fault Tolerance Testing
10.4 Performance Testing
11 Operation Best Practices
11.1 Adding and Removing Disks
11.2 CRUSH Map Adjustments
12 Upgrade Practices
UPDATE: recommend applying services with ceph orch apply -i YAML_FILE instead of specifying them on the command line.
Get inspired by:
https://ceph.readthedocs.io/en/latest/mgr/orchestrator/#placement-specification
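A sketch of what such a service specification file could look like (file name and host names are illustrative):
service_type: mon
placement:
  hosts:
    - ses-min1
    - ses-min2
    - ses-min3
It would then be applied with:
ceph orch apply -i mon-spec.yml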
18.2.2 Verify DNS
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-dns
The example for resolving an IP address to a host name is missing the nslookup command.
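For example (a sketch; the IP address is illustrative), the missing reverse lookup could be shown as:
nslookup 192.168.100.42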
6.4.5 DriveGroups
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#drive-groups
In the simplest case this could be the 'rotational' flag (all solid-state drives are to be db_devices, all rotating ones data devices) or something more involved such as 'model' strings, or sizes.
Providing only rotational: true does not create WAL/DB devices on SSDs. @jschmid1 Can you confirm this? Thanks!
DeepSea will provide code that translates these DriveGroups into actual device lists for inspection by the user.
DeepSea -> cephadm (this sentence should refer to cephadm, not DeepSea).
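For reference (a sketch only, assuming the cephadm OSD service specification format; the service ID and filters are illustrative), a rotational-based DriveGroup that puts data on HDDs and DB devices on SSDs would need both filters:
service_type: osd
service_id: default_drive_group
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0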
Link To Section
The documentation should probably be added to the "Deployment" guide, as its content seems to fit there best. But maybe it makes sense to have it in the "Administration" guide as well.
The monitoring stack is automatically deployed alongside the Ceph cluster when cephadm is used for the deployment and is not explicitly told not to deploy the monitoring stack. So cephadm is the right context, though I do not see a Monitoring heading in the cephadm chapter in the SES7 documentation where that might fit.
Summary
We need to document a non-obvious security model of Prometheus (part of our monitoring stack) to make users aware of it.
Basic Example
Basically, the information contained in the note block of https://github.com/ceph/ceph/pull/34902/files#diff-e5580505067bb51bc581a57fbb91df28R391-R403 (marked in the link) needs to be added. It may make sense or be necessary to add the paragraph above it, too, to have a short introduction to that the monitoring topic.
Motivation
The security model and its implications may not be obvious and we hence should briefly describe it and have a reference to the full documentation.
Notes
"Monitors" or "Mons" are a component of Ceph and are not related to "Monitoring" where its about collecting metadata of the Ceph cluster and related machines to graphically visualize them and provide alerting.
3.3.1 RocksDB and WAL
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.2.8.2
If we have slow (HDD) and fast (NVMe) devices, we do not create a separate WAL partition (WAL=0) and instead size the RocksDB device at 64 GB.
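For illustration (a sketch only; the option names are an assumption based on the upstream BlueStore configuration options, and the values are examples), this could be expressed in the [osd] section of ceph.conf as:
bluestore_block_wal_size = 0
bluestore_block_db_size = 68719476736   # 64 GB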
19.2.1 Automatic Cache Sizing
https://susedoc.github.io/doc-ses/master/single-html/ses-admin/#config-auto-cache-sizing
Document does not provide guidance for selecting an osd_memory_target value. This is important, as this selection will further drive hardware requirements.
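For illustration (a sketch only; the value is an example, not a recommendation), the setting itself can be applied with:
ceph config set osd osd_memory_target 8589934592   # 8 GB per OSD daemon
The guidance should explain how to derive such a value from the available RAM and the number of OSDs per node.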
Figure 2.1: Network Overview
http://docserv.nue.suse.com/documents/SES_6/ses-deployment/single-html/#network-overview-figure
I have a few problems with this diagram:
6.3.2.11 Verify Cluster Configuration
In 6.3.2.4 Specify First MON/MGR Node there is an important note telling you to also add the admin role to the bootstrap node.
I think this should be reflected in the ceph-salt config ls example.
Current:
ceph-salt config ls
o- / ............................................................... [...]
o- ceph_cluster .................................................. [...]
| o- minions .............................................. [Minions: 5]
| | o- ses-master.example.com .................................. [admin]
| | o- ses-min1.example.com ................................ [bootstrap]
| | o- ses-min2.example.com ................................. [no roles]
Proposed:
ceph-salt config ls
o- / ............................................................... [...]
o- ceph_cluster .................................................. [...]
| o- minions .............................................. [Minions: 5]
| | o- ses-master.example.com .................................. [admin]
| | o- ses-min1.example.com ......................... [bootstrap, admin]
| | o- ses-min2.example.com ................................. [no roles]
20.1.2 Single Gateway Example
http://docserv.nue.suse.com/documents/SES_6/ses-admin/single-html/#sec-ses-cifs-example
This walkthrough should include instructions for adding a user to the local smbpasswd file. E.g.:
smbpasswd -a smbuser
where smbuser should already exist as a local Unix user.
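For example (a sketch; the user name is illustrative), the full sequence could be shown as:
useradd -m smbuser
smbpasswd -a smbuser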
4.1.2 Storage Performance
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.4.6
Similar to the iperf3 command we should also give a fio command example, here.
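A possible example (a sketch only; the file path and parameters are illustrative and not a recommendation):
fio --name=rand-write --filename=/var/tmp/fio-test --size=4G --direct=1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 --runtime=60 --time_based --group_reporting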
15.13.2.7 Update the Ceph Configuration File
https://documentation.suse.com/ses/6/single-html/ses-admin/#update-ceph-config-file
Could we add the exact files and adjustments required so that the DeepSea setup matches the ceph.conf adjustments required for this multi-zone RGW setup, and so that each RGW keeps the correct zone in its ceph.conf even after running through all DeepSea stages?
(global.yml, rgw...conf, ...)
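For illustration (a sketch only; the section, zone, and zone group names are invented for the example), the kind of ceph.conf fragment that needs to survive the DeepSea stages is:
[client.rgw.rgw1]
rgw_zone = zone-a
rgw_zonegroup = zg-default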
4.2.3 SSD Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.5
Could we add some details on how much performance improvement during read this tuning achieved in the described test cluster and how much performance loss was on the write side by changing these settings?
Link To Section
SES6 Administration Guide, Table of Contents
Summary
The appendices are numbered strangely.
Steps To Reproduce
Examine the ToC, see the following:
A DeepSea Stage 1 Custom Example
B Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases
Glossary
C Documentation Updates
C.1 Maintenance update of SUSE Enterprise Storage 6 documentation
C.2 June 2019 (Release of SUSE Enterprise Storage 6)
and note that the appendix "Glossary" (with no letter) is situated after Appendix B, and before Appendix C.
Expected Results
My expectation would be that the Appendices would have ascending letters, with no skipped letters. Also, if there are both lettered and unlettered appendices, the lettered ones would come first, and the unlettered ones second, instead of the unlettered ones being interspersed among the lettered ones.
Link To Section
susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#deploy-cephadm-configure-admin
Summary
Missing additional invocation
Notes
If a new container is available, but the Ceph version remains the same, then
# ceph orch upgrade start REGISTRY_URL
such as
# ceph orch upgrade start registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph
will update the containers although both are 15.2.1
Side note: we are only mentioning the container upgrade here. What about the host upgrade? Do we expect the customer to do zypper up, use ceph-salt, or something else?
5.1 Obtaining Ceph Metrics
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-obtaining-metrics
Here I read this:
it is helpful to have an understanding of the critical metrics to monitor and what they indicate
So this automatically raises the question: where can I get the explanations for this, so that I can understand the output of "ceph daemon osd.0 perf dump | less"? I would have expected the relevant parameters and values to be explained here, or a pointer to the place where I can find that information.
6.4.5.1 Specification
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#drive-groups-specs
For FileStore setups, drive_groups.yml can be as follows:
There is no FileStore in SES7, so this section can be dropped. We need to inform users to migrate FileStore OSDs to BlueStore in the upgrade section.
2.3 Hardware and Software
https://documentation.suse.com/ses/6/single-html/ses-tuning/#tuning-hardware-software
I assume this SES 5.5 line is wrong and should read 6?
Figure 2.1: Ceph Dashboard Login Screen
https://susedoc.github.io/doc-ses/master/html/ses-admin/ceph-dashboard.html#id-1.4.4.2.5.2.4
Dashboard login credentials are no longer obtained with
salt-call grains.get dashboard_creds
You need to set them during basic cluster deployment with ceph-salt, or later on with ceph dashboard ac-user-set-password admin <new_password>.
Folks using the documentation to deploy encrypted OSDs might want to convince themselves that the resulting OSDs are, in fact, encrypted. The SES6 documentation does not currently provide any guidance for this case.
After using DeepSea to deploy encrypted OSDs as described elsewhere in the documentation, here is a procedure for easily - and without too much disruption - verifying that the OSDs are encrypted.
Check the output of ceph-volume lvm list
(it should be run as root on the node where the OSDs in question are located):
# ceph-volume lvm list
====== osd.3 =======
[block] /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
block device /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
block uuid m5F10p-tUeo-6ZGP-UjxJ-X3cd-Ec5B-dNGXvG
cephx lockbox secret
cluster fsid 413d9116-e4f6-4211-a53b-89aa219f1cf2
cluster name ceph
crush device class None
encrypted 0
osd fsid f8596bf7-000f-4186-9378-170b782359dc
osd id 3
type block
vdo 0
devices /dev/vdb
====== osd.7 =======
[block] /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987
block device /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987
block uuid 1y3qcS-ZG01-Y7Z1-B3Kv-PLr6-jbm6-8B79g6
cephx lockbox secret
cluster fsid 413d9116-e4f6-4211-a53b-89aa219f1cf2
cluster name ceph
crush device class None
encrypted 0
osd fsid 0f9a8002-4c81-4f5f-93a6-255252cac2c4
osd id 7
type block
vdo 0
devices /dev/vdc
Note the line that says encrypted 0 - this means the OSD is not encrypted. Here are the possible values:
encrypted 0 <- not encrypted
encrypted 1 <- encrypted
If you get the following error, it means the node where you are running the command does not have any OSDs on it:
# ceph-volume lvm list
No valid Ceph lvm devices found
OK, enough about ceph-volume lvm list, already! Let's say I have deployed a cluster with an OSD for which ceph-volume lvm list shows encrypted 1, so I know ceph-volume thinks that OSD is encrypted. But how can I gain more confidence that it's really encrypted? For that, proceed to "Step 2".
Ceph OSD encryption-at-rest relies on the Linux kernel's dm-crypt subsystem and the Linux Unified Key Setup ("LUKS"). When creating an encrypted OSD, ceph-volume creates an encrypted logical volume and saves the corresponding dm-crypt secret key in the Ceph Monitor data store. When the OSD is to be started, ceph-volume ensures the device is mounted, retrieves the dm-crypt secret key from the Ceph Monitors, and decrypts the underlying device. This creates a new device, containing the unencrypted data, and this is the device the Ceph OSD daemon is started on.
Since the OSD itself does not know whether the underlying logical volume is encrypted or not, there is no ceph osd command that will return this information. But it is possible to query LUKS for it, as follows.
First, get the device of the OSD logical volume you are interested in. This can be obtained from the ceph-volume lvm list output:
block device /dev/ceph-d9f09cf7-a2a4-4ddc-b5ab-b1fa4096f713/osd-data-71f62502-4c85-4944-9860-312241d41bb7
Then, dump the LUKS header from that device:
# cryptsetup luksDump OSD_BLOCK_DEVICE
Here is what the output looks like when the OSD is NOT encrypted:
Device /dev/ceph-38914e8d-f512-44a7-bbee-3c20a684753d/osd-data-0f385f9e-ce5c-45b9-917d-7f8c08537987 is not a valid LUKS device.
And when it IS encrypted:
master:~ # cryptsetup luksDump /dev/ceph-1ce61157-81be-427d-83ad-7337f05d8514/osd-data-89230c92-3ace-4685-97ff-6fa059cef63a
LUKS header information for /dev/ceph-1ce61157-81be-427d-83ad-7337f05d8514/osd-data-89230c92-3ace-4685-97ff-6fa059cef63a
Version: 1
Cipher name: aes
Cipher mode: xts-plain64
Hash spec: sha256
Payload offset: 4096
MK bits: 256
MK digest: e9 41 85 f1 1b a3 54 e2 48 6a dc c2 50 26 a5 3b 79 b0 f2 2e
MK salt: 4c 8c 9d 1f 72 1a 88 6c 06 88 04 72 81 7b e4 bb
b1 70 e1 c2 7c c5 3b 30 6d f7 c8 9c 7c ca 22 7d
MK iterations: 118940
UUID: 7675f03b-58e3-47f2-85fc-3bafcf1e589f
Key Slot 0: ENABLED
Iterations: 1906500
Salt: 8f 1f 7f f4 eb 30 5a 22 a5 b4 14 07 cc da dc 48
b5 e9 87 ef 3b 9b 24 72 59 ea 1a 0a ec 61 e6 42
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
5.5.1 MDS Tuning
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.4.12.3
I assume ceph-daemon perf dump should read ceph daemon perf dump.
4.2.1 CPU Mitigations
mitigations=off is set on the kernel command line by adjusting /etc/default/grub, regenerating the GRUB2 configuration, and rebooting.
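A sketch of the typical procedure (assuming the default SLES GRUB2 layout; the existing kernel parameters are illustrative):
# in /etc/default/grub, append mitigations=off to the kernel command line:
GRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet mitigations=off"
# regenerate the GRUB configuration and reboot:
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot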
19.5 Setting the Log Level
https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-nfsganesha.html#ceph-nfsganesha-loglevel
"You change the default debug level NIV_EVENT by editing the file /etc/sysconfig/nfs-ganesha"
This file does not exist, but /etc/sysconfig/ganesha does. I think /etc/sysconfig/ganesha is created by one of nfs-ganesha's RPM scriptlets. This was seen with nfs-ganesha-2.8.3+git0.d504d374e-3.3.1.x86_64.
Links To Duplicated Sections
https://documentation.suse.com/ses/6/single-html/ses-admin/#id-1.3.10
https://documentation.suse.com/ses/6/single-html/ses-deployment/#id-1.4.7
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.7
Summary
The chapter/section "Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases" appears in each documentation book. This is perhaps by design (though I personally would not do it this way), but we also have the Glossary, which is duplicated in the same way.
The difference between "Glossary" and "Ceph Maintenance Updates Based on Upstream 'Nautilus' Point Releases" is that the former has no letter associated, while the latter gets an automatically assigned letter (Appendix B, Appendix D, etc.) which is different for each book. In my mind, this unique letter makes me expect the material to also be unique to the book in question.
Steps To Reproduce
Expected Results
To ease maintenance and lessen overall confusion, I would expect that we would not duplicate entire sections, and that given material would appear in the documentation in only one place.
If we really want to duplicate this material and have it in each book, I would prefer to have it as an unnumbered appendix like the Glossary, which is also identical in each book.
6.4.3.1 Deploy Ceph Monitors and Ceph Managers
I guess this is a docs problem, but it may also be something we want to support feature-wise. For now, the docs are wrong in this section and its accompanying terminal examples:
If all MONs or MGRs are on the same subnet, specify the short host names of the MONs or MGRs you need to add:
If you try to use multiple hostnames then the command returns the following
enno1 not valid: enno1 not one of 'true', 'false'
Invalid command: unused arguments: ['enno1']
orch apply [mon|mgr|rbd-mirror|crash|alertmanager|grafana|node-exporter|prometheus] [<placement>] [--unmanaged] : Update the size or placement for a service or apply a large yaml spec
Error EINVAL: invalid command
If I then try this with only one hostname, everything is working out correctly.
18.2.9 Starting the Services
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-services
As discussed earlier in #243 , the AD join procedure assumes that Samba services should be enabled / started via systemctl, but this is incorrect if Samba services are managed by Pacemaker, as is the case for High Availability configurations (https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#sec-ses-cifs-ha).
Hi,
in Section 12 of the Deployment Guide for SES 6 there is the following warning:
"Native CephFS and NFS clients are not restricted by file locks obtained via Samba, and vice versa. Applications that rely on cross protocol file locking may experience data corruption if CephFS backed Samba share paths are accessed via other means."
Is this the only cross-protocol issue we have, or are there other possibilities which could lead to corrupted data? E.g. S3 data accessed via CIFS...
It would be useful to provide a list of all possible cross-protocol issues.
5.3.1 Logging
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.4.10.2
From my understanding, we should not recommend setting this in this way in any production environment, and we should be very clear about that.
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#id-1.4.7
Summary
The last upstream Nautilus release mentioned in the appendix is 14.2.5. We need to add sections for 14.2.6, 14.2.7, 14.2.8, and 14.2.9
Text of the New Sections
14.2.6
This release fixed a ceph-mgr bug that caused the MGR to become unresponsive on larger clusters. SES users were never exposed to the bug.
14.2.7
This upstream release patched two security flaws:
In SES6, these flaws were patched in ceph 14.2.5.382+g8881d33957b.
14.2.8
In addition to bugfixes, this major upstream release brought a number of notable changes:
The default value of bluestore_min_alloc_size_ssd has been changed to 4K to improve performance across all workloads.
The following OSD memory config options related to bluestore cache autotuning can now be configured during runtime:
osd_memory_base (default: 768 MB)
osd_memory_cache_min (default: 128 MB)
osd_memory_expected_fragmentation (default: 0.15)
osd_memory_target (default: 4 GB)
The above options can be set with:
ceph config set osd <option> <value>
The MGR now accepts profile rbd and profile rbd-read-only user caps. These caps can be used to provide users access to MGR-based RBD functionality such as rbd perf image iostat and rbd perf image iotop.
The configuration value osd_calc_pg_upmaps_max_stddev used for upmap balancing has been removed. Instead, use the MGR balancer configuration option upmap_max_deviation, which now is an integer number of PGs of deviation from the target PGs per OSD. This can be set with a command like ceph config set mgr mgr/balancer/upmap_max_deviation 2. The default upmap_max_deviation is 5. There are situations where CRUSH rules would not allow a pool to ever have completely balanced PGs. For example, if CRUSH requires 1 replica on each of 3 racks, but there are fewer OSDs in 1 of the racks. In those cases, the configuration value can be increased.
CephFS: multiple active MDS forward scrub is now rejected. Scrub is currently only permitted on a file system with a single rank. Reduce the ranks to one via ceph fs set <fs_name> max_mds 1.
Ceph now refuses to create a file system with a default EC data pool. For further explanation, see:
https://docs.ceph.com/docs/nautilus/cephfs/createfs/#creating-pools
Ceph will now issue a health warning if a RADOS pool has a pg_num value that is not a power of two. This can be fixed by adjusting the pool to a nearby power of two:
ceph osd pool set <pool-name> pg_num <new-pg-num>
Alternatively, the warning can be silenced with:
ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
14.2.9
This upstream release patched two security flaws:
In SES6, these flaws were patched in ceph 14.2.5.389+gb0f23ac248.
Notes
The new sections should be added to the top of the appendix, and in reverse order, to preserve the existing reverse ordering of the sections.
Insert new section advising users to set their default dashboard username & password:
ceph-salt config /cephadm_bootstrap/dashboard/username set admin
ceph-salt config /cephadm_bootstrap/dashboard/password set admin
20.1.2 Single Gateway Example
http://docserv.nue.suse.com/documents/SES_6/ses-admin/single-html/#sec-ses-cifs-example
We currently only provide instructions for using Samba with the ceph VFS backend. Instructions for using Samba atop a kernel CephFS mount should also be present.
Link To Section
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-main-osd
Summary
Now that msgr2 is fully supported in SES6 [1], we need to instruct users to enable it at the appropriate point in the SES5.5 -> 6 upgrade procedure, using the instructions at [2].
[1] #310
[2] https://ceph.io/releases/v14-2-0-nautilus-released/#instructions
What Exact Changes To Make To the Upgrade Procedure
In the section "Upgrade Ceph OSDs" of the SES6 Deployment Guide, at step 9 all of the OSD nodes will have been upgraded and the user will be checking the cluster health. There, they will notice (and this is correctly reflected in the documentation) a health warning message:
health: HEALTH_WARN
3 monitors have not enabled msgr2
To address this and another issue, let us add new steps 10 and 11 (after step 9, but before the existing step 10, which will be renumbered as a result of inserting these new steps):
Once the last OSD node has been upgraded, issue the following command:
ceph osd require-osd-release nautilus
This disallows pre-SES6/Nautilus OSDs and enables all new SES6/Nautilus-only OSD functionality.
Enable the new v2 network protocol by issuing the following command:
ceph mon enable-msgr2
This will instruct all monitors that bind to the old default port 6789 for the legacy v1 Messenger protocol to also bind to the new v2 protocol port 3300. To see whether all monitors have been updated, run:
ceph mon dump
and verify that each monitor has both a v2: and a v1: address listed.
6.3.3 Deploy Cluster
Ricardo Marques: "It would be great to document that, when reboot is enabled in the ceph-salt config (it is enabled by default), minions will reboot automatically after package updates (if needed).
If the Salt Master (the host where you are running the ceph-salt apply command) is part of the Ceph cluster (i.e. it is a /ceph_cluster/minion) and needs a reboot, this reboot must be done manually. After rebooting the master, the user must run ceph-salt apply again to continue the deployment."
Link To Section
http://docserv.suse.de/documents/SES_6/ses-admin/html/storage-tips.html#id-1.3.8.2.14.7
Summary
The documentation suggests overriding a file that is shipped by the installation package; because of that, all changes will be lost after an update.
Steps To Reproduce
Expected Results
The documentation should explain that it is possible to customize the default behavior of the disk_led.sls by using the DeepSea way (https://github.com/SUSE/DeepSea/blob/master/srv/pillar/ceph/README).
This is done for example here: http://docserv.suse.de/documents/SES_6/ses-admin/html/storage-troubleshooting.html#storage-bp-recover-clockskew
Actual Results
Notes
The documentation should mention this:
The default behavior can be customized by adding, for example for the ledmon package and the ledctl tool:
disk_led:
cmd:
ident:
'on': ledctl locate='{device_file}'
'off': ledctl locate_off='{device_file}'
fault:
'on': ledctl locate='{device_file}'
'off': ledctl locate_off='{device_file}'
to /srv/pillar/ceph/stack/global.yml or any other YAML file mentioned in https://github.com/SUSE/DeepSea/blob/master/srv/pillar/ceph/README. If the customization should only apply to a specific node (minion), the file stack/ceph/minions/{{minion}}.yml needs to be used instead. Simply reference the section where this DeepSea feature is explained.
Link To Section
This issue is about adding a new section in between the following two sections:
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-main-osd
https://documentation.suse.com/ses/6/single-html/ses-deployment/#filestore2bluestore
Summary
The purpose of the new section is to give users some tips on final steps they should take after completing the SES5.5 -> 6 upgrade, as described in the upstream Mimic -> Nautilus upgrade procedure [1].
[1] https://ceph.io/releases/v14-2-0-nautilus-released/#instructions
What Exact Changes to Make to the SES6 Deployment Guide
Immediately after existing section "6.12 Upgrade Ceph OSDs", and immediately before existing section "6.13 OSD Migration to BlueStore", insert a completely new section with the following content:
6.13 Final Steps
Final steps to take to complete the upgrade:
For each host that has been upgraded (not just OSD nodes, but also MON, MGR, MDS, and Gateway nodes, as well as client hosts), you should update your ceph.conf file so that it either specifies no monitor port (if you are running the monitors on the default ports) or references both the v2 and v1 addresses and ports explicitly. (Things will still work if only the v1 IP and port are listed, but each CLI instantiation or daemon will need to reconnect after learning that the monitors also speak the v2 protocol, slowing things down a bit and preventing a full transition to the v2 protocol.)
Finally, consider enabling the telemetry module to send anonymized usage statistics and crash information to the upstream Ceph developers. To see what would be reported (without actually sending any information to anyone):
ceph mgr module enable telemetry
ceph telemetry show
If you are comfortable with the high-level cluster metadata that will be reported, you can opt-in to automatically report it:
ceph telemetry on
6.3.2.10 Configure Cluster Update Behavior
I would like to see an explanation or arguments for why, or in which situations, it might make sense to disable package updates or prevent automatic node reboots during cluster deployment.
5.5.2.1 Specification
https://documentation.suse.com/ses/6/single-html/ses-deployment/#ds-drive-groups-specs
This subsection presents two examples: one for BlueStore and another for FileStore. The introductory text for the first example does not mention that it is for BlueStore, while the one for the second example does.
Let's fix this, perhaps by changing the first intro text from
/srv/salt/ceph/configuration/files/drive_groups.yml accepts the following options:
to
/srv/salt/ceph/configuration/files/drive_groups.yml can take one of two basic forms, depending on whether BlueStore or FileStore is to be used.
For BlueStore setups, drive_groups.yml can be as follows:
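For illustration only (a sketch, not the exact text to insert; the group name and filters are assumptions based on the DeepSea DriveGroups format), a minimal BlueStore example could look like:
drive_group_default:
  target: '*'
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0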
4.1.1 Network Performance
Depending on the hash mode and on client vs. server bandwidth, we might have to run multiple iperf instances in parallel to measure the full combined bandwidth that is possible (single connection vs. multiple parallel connections).
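For example (a sketch; the server address, stream count, and duration are illustrative), iperf3 can run parallel streams in a single invocation:
iperf3 -c 192.168.100.1 -P 8 -t 30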
4.2.4.1.2 PCIe Bus Adjustment
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.6.4.3
salt cmd.run just adjusts settings "on the fly", and after a reboot they are back to the defaults. So when recommending these tunings, we should also advise on how to make them persistent so that they survive a reboot.
(This needs to be done for the whole document: search for all salt '*' cmd.run invocations and add the information on how to make the setting persistent.)
4.2.4.1.1 Jumbo Frames
https://documentation.suse.com/ses/6/single-html/ses-tuning/#id-1.5.3.3.5.6.4.2
should be adjusted to:
To set this persistently, use YaST to set the MTU for all the network interfaces in use, or adjust the MTU settings in the /etc/sysconfig/network/ifcfg-* files and restart the network. Please ensure that jumbo frames are enabled on all switches, routers, and network cards that will transport these packets.
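For example (a sketch; the interface name is illustrative), in /etc/sysconfig/network/ifcfg-eth0 set:
MTU='9000'
and then restart the network, for example with systemctl restart wicked.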
6.2 Deploy Salt
https://susedoc.github.io/doc-ses/master/html/ses-deployment/deploy-cephadm.html#deploy-salt
Under "View each minion's fingerprint:" the terminal prompt reads root@master. This is inconsistent with the line above, which says that this needs to be executed on a minion.
Executing the example on the master does not give any output; executing it on a minion does indeed give the desired and shown output.
18.2.9 Starting the Services
https://documentation.suse.com/ses/6/html/ses-all/cha-ses-cifs.html#cephfs-ad-services
As of SLES 15 SP1, the names are 'smb.service', 'nmb.service', and 'winbind.service'.
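For example (illustrative, and only applicable when the services are managed by systemd rather than by Pacemaker):
systemctl enable smb.service nmb.service winbind.service
systemctl start smb.service nmb.service winbind.service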
6.8.1 Manual Node Upgrade Using the Installer DVD
https://documentation.suse.com/ses/6/single-html/ses-deployment/#upgrade-one-node-manual
The next quarterly update medium - to be released next week - will contain a fix for https://bugzilla.suse.com/show_bug.cgi?id=1162545, see also https://download.suse.com/Download?buildid=0Hf9s7nDqOw~
The change adds an option "YAST_ACTIVATE_LUKS=0" that can be given on the YaST command line when booting from DVD, so that no password is requested for encrypted disks.
Let's document the option and state that this needs the Quarterly update from April 2020.
Link To Section
https://www.suse.com/releasenotes/x86_64/SUSE-Enterprise-Storage/6/#Intro.Support.Techpreviews
Summary
As of the next SES6 documentation maintenance update (the one that will deliver the related doc changes), msgr V2 is no longer considered a Tech Preview, but rather a fully supported feature of SES6.
What This Means for the SES6 Release Notes
In the "Tech Previews" section, the following "Ceph Core" bullet point should be dropped:
New network protocol, or “wire protocol” version 2 (enhancing current protocol, version 1), adding support for on-the-wire encryption. Ceph daemons use this protocol internally to communicate with one another (called the Messenger and often abbreviated to "msgr").
In the immediately following section ("Features"), a new subsection should be added as part of "4.6 RADOS (Ceph core)". It should look like this:
Messenger V2:
SES6 brings a new network protocol, or “wire protocol” version 2 (enhancing the current protocol, version 1). The new protocol brings support for on-the-wire encryption. Ceph daemons use this protocol (called the Messenger and often abbreviated to "msgr") internally to communicate with one another.