Comments (7)
I am struggling to come up with a scenario where restoring nodes would ever be the right thing for our clients. Are there any specific scenarios you have run into where restoring the actual nodes would be "appropriate"? If we don't have a specific use case in mind, should we simply remove its support for now and be open to adding it back with a specific use case, to ensure that we get the implementation correct?
If you would like to keep it in, by "make it hard" were you thinking of an explicit CLI flag that would have to be set and that gets merged into the include and exclude options?
from velero.
I agree that it's probably unlikely that restoring a node would ever be something anyone would want to do. I have a few thoughts on how to make this hard:
- put nodes in the --exclude-resources by default
  - if we do that, what happens if someone has --exclude-resources widgets - does that now mean nodes are no longer explicitly excluded?
- have a separate flag like --dangerous-restore-nodes that is the only way you could ever restore nodes, regardless of what's in --include-resources
- @jbeda had a thought to maybe treat this like chmod, where instead of having separate flags/fields for includes and excludes, just have one, e.g. --resources +deployments,+services,-nodes. I'd want to think through a change like this before doing it.
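To make the chmod-style idea concrete, here is a minimal Go sketch of how a single spec such as +deployments,+services,-nodes could be split into include and exclude lists. The --resources flag does not exist; the function name parseResourceSpec and the error behavior are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// parseResourceSpec splits a chmod-style spec into include and exclude lists.
// Hypothetical helper: the syntax comes from the idea floated in this thread,
// not from an existing flag.
func parseResourceSpec(spec string) (includes, excludes []string, err error) {
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		// Each entry needs a sign plus at least one character of resource name.
		if len(item) < 2 {
			return nil, nil, fmt.Errorf("invalid resource entry %q", item)
		}
		switch item[0] {
		case '+':
			includes = append(includes, item[1:])
		case '-':
			excludes = append(excludes, item[1:])
		default:
			return nil, nil, fmt.Errorf("entry %q must start with + or -", item)
		}
	}
	return includes, excludes, nil
}

func main() {
	inc, exc, err := parseResourceSpec("+deployments,+services,-nodes")
	if err != nil {
		panic(err)
	}
	fmt.Println("include:", inc)
	fmt.Println("exclude:", exc)
}
```

A single spec like this would sidestep the question of how separate include and exclude flags interact, but it would also change the CLI surface, which is why it deserves its own discussion.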
from velero.
Yeah, the 1st scenario was leaning me towards the 2nd option. The refactoring of the resources concept as a whole is interesting, but it is a much larger thing to realize and perhaps deserves its own conversation while we implement the 2nd option in the meantime?
If you have a preference (dropping it altogether vs. an explicit flag), I could take a crack at that.
from velero.
One thing I don't want to lose is backing up nodes. For example, nodes can have custom labels, annotations, etc. that admins add manually, and I think it's worth preserving those in the backup.
Maybe we need to treat nodes differently, so that the act of restoring a node is:
- If the node exists in the target cluster, merge labels and annotations from the node in the backup with the node in the target cluster
- Otherwise, don't do anything (and don't even do --dangerous-restore-nodes)
We might even need to go further if we do (1) and find some way for the user to indicate if they even want annotations and/or labels to be merged and updated.
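As a rough illustration of the merge behavior described above, the following Go sketch matches a backed-up node to a live node by name and merges labels and/or annotations only when the caller opts in. The nodeMeta and mergeOptions types and the function names are hypothetical stand-ins, not Kubernetes or Ark types. The conflict policy (the live cluster's value wins) is also an assumption, since the thread does not settle it.

```go
package main

import "fmt"

// nodeMeta is a simplified, hypothetical stand-in for a Node's metadata.
type nodeMeta struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// mergeOptions lets the user choose what gets merged (hypothetical).
type mergeOptions struct {
	Labels      bool
	Annotations bool
}

// mergeMaps returns a new map holding the entries of both inputs.
// On a key conflict the live value wins; this policy is an assumption.
func mergeMaps(live, backup map[string]string) map[string]string {
	out := make(map[string]string, len(live)+len(backup))
	for k, v := range backup {
		out[k] = v
	}
	for k, v := range live {
		out[k] = v
	}
	return out
}

// mergeNodeFromBackup returns the updated node and true if a node with the
// same name exists in the target cluster; otherwise it does nothing.
func mergeNodeFromBackup(backup nodeMeta, live map[string]nodeMeta, opts mergeOptions) (nodeMeta, bool) {
	target, ok := live[backup.Name]
	if !ok {
		return nodeMeta{}, false // node is absent: never create it
	}
	if opts.Labels {
		target.Labels = mergeMaps(target.Labels, backup.Labels)
	}
	if opts.Annotations {
		target.Annotations = mergeMaps(target.Annotations, backup.Annotations)
	}
	return target, true
}

func main() {
	live := map[string]nodeMeta{
		"node-a": {Name: "node-a", Labels: map[string]string{"zone": "live"}},
	}
	backup := nodeMeta{
		Name:   "node-a",
		Labels: map[string]string{"zone": "old", "team": "infra"},
	}
	merged, ok := mergeNodeFromBackup(backup, live, mergeOptions{Labels: true})
	fmt.Println(ok, merged.Labels)
}
```

Returning a copy rather than mutating the live node keeps the decision to apply the change with the caller, which seems useful while the semantics are still being discussed.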
tl;dr I would probably start with hardcoding nodes into the excluded resources in pkg/restore/restore.go when doing a restore. That at least stops the pain and allows us to continue to discuss.
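A minimal Go sketch of the hardcoding idea follows, assuming the restore path builds its exclude list from user input. The names alwaysExcluded and effectiveExcludes are hypothetical and do not reflect the actual contents of pkg/restore/restore.go. The point is that nodes stays excluded even when the user supplies their own list, such as widgets.

```go
package main

import "fmt"

// alwaysExcluded lists resources a restore should never touch (hypothetical).
var alwaysExcluded = []string{"nodes"}

// contains reports whether items includes want.
func contains(items []string, want string) bool {
	for _, item := range items {
		if item == want {
			return true
		}
	}
	return false
}

// effectiveExcludes returns the user's excludes plus the hardcoded ones,
// without modifying the caller's slice and without adding duplicates.
func effectiveExcludes(userExcludes []string) []string {
	out := append([]string{}, userExcludes...)
	for _, resource := range alwaysExcluded {
		if !contains(out, resource) {
			out = append(out, resource)
		}
	}
	return out
}

func main() {
	fmt.Println(effectiveExcludes([]string{"widgets"}))
	fmt.Println(effectiveExcludes(nil))
}
```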
from velero.
I understand where you are going and am all for marking them in the backup for posterity and auditing.
On the restore front, I'm a little unsure of how we would proceed though. I personally think of Ark as "what's running in my cluster" and perhaps somewhat simply think of Nodes as being "the" cluster.
Is the equality comparison for "If the node exists in the target cluster" simply the node name being equal? Are there other attributes such as taints that should also be merged in?
from velero.
> On the restore front, I'm a little unsure of how we would proceed though. I personally think of Ark as "what's running in my cluster" and perhaps somewhat simply think of Nodes as being "the" cluster.

A very good point.

> Is the equality comparison for "If the node exists in the target cluster" simply the node name being equal? Are there other attributes such as taints that should also be merged in?

Yes, just node name. This is something we need to get right, so we should spend sufficient time reasoning through it. In the short term, I think we're significantly better off if we blacklist restoring nodes 100%.
from velero.
Regarding the restoration process, I'm a bit uncertain about the steps we should take. I typically view Ark as a tool for managing what's currently active in my cluster, and I tend to consider nodes as the fundamental components of the cluster.
This brings up a crucial point. When determining if a node exists in the target cluster, is the equality comparison solely based on the node name being equal? Should we also consider other attributes, such as taints, for a more comprehensive merge?
Indeed, it's just the node name that serves as the equality comparison. Getting this right is crucial, so it's essential to invest enough time in thoughtful consideration. In the short term, it might be wise to blacklist the restoration of nodes entirely to avoid potential issues.
On an unrelated note, I work in the industrial adhesive manufacturing sector. Considering our specific industry needs, are there any additional considerations or best practices for the restoration process that we should be aware of or incorporate into our strategy?
Your insights on this matter would be greatly appreciated.
from velero.