Comments (6)
Generally speaking, I suggest resisting the temptation to over-engineer features. Rather, design and implement what you know is needed, then see how that goes and whether there is demand for more or for something different.
Regarding this feature specifically, speaking only for our own use case, we have no need for both detailed "off-time" and "on-time" specifications. Our team has typical work hours and an on-call rotation for non-working hours. I imagine that describes the majority of chaoskube users. Since chaoskube is (from our perspective) intended to be run as an ongoing stability test, all we care about is being able to limit which services are impacted, and not making on-call life unnecessarily harder for anyone. You may notice that chaosmonkey does not provide such detailed scheduling, AFAIK. If someone wants to run chaoskube on the weekend, they can just deploy another instance of it to do whatever they want. The scheduling will never be perfect anyway, since at the very least the holidays will need to be updated from time to time. Finally, we view chaoskube as a tool that gives us confidence in the resilience of our systems, but it is not critical to our infrastructure and does not need precise scheduling capabilities.
Another reason to avoid precise scheduling capability is that it is significantly more difficult to implement correctly. You will have to include all kinds of logic to handle periods that span midnight and Daylight Saving jumps. And you will have to try to find a way to support such configuration that is not confusing. People will get confused about what their configuration really means, no matter how carefully you write your documentation, and then you will get all kinds of bug reports that are actually user-error or user misunderstanding.
You could, perhaps, if the need was shown to be significant, add the ability to override each global "off-time" attribute with service-specific ones through annotations. But I would wait and see if this is a real need, because it adds complexity. Our team does not need this.
from chaoskube.
I already did some of the things you proposed in my PR. If you want to extend it with the things I don't have, that'd be the easiest for you.
from chaoskube.
@twildeboer @klautcomputing I'll try to look at the PR again over the weekend. At first sight, the way to specify the time frame as well as the implementation seemed quite complicated to me.
@klautcomputing do you think defining the range similarly to https://github.com/hjacobs/kube-downscaler#configuration would simplify usage as well as implementation and still capture your use cases?
E.g., being active during work hours as well as at midday on weekends would be:
--active-at "Fri-Fri 10:00-16:00 CET, Sat-Sun 10:00-12:00 CET"
from chaoskube.
@linki could you leave a couple of comments on my code where you think my implementation is too complicated?
--active-at "Fri-Fri 10:00-16:00 CET, Sat-Sun 10:00-12:00 CET"
Did you maybe mean Mon-Fri? Because otherwise I don't see how that format is meant to work. If yes, then I think that'd be easily doable. Thinking about it again, we might not want this as a flag for chaoskube but instead as a label in the manifest, which would allow individual teams to specify their own schedule.
This raises the general question of whether we want chaoskube to be purely opt-in. Given that chaos engineering is not something that should surprise a team, and that teams should have made an active decision to test their systems with chaos, opt-in might be the right choice. It would also get rid of --percentage in my code and make it a little easier.
from chaoskube.
@linki - PR for this feature waiting for you.
from chaoskube.
@klautcomputing @twildeboer Thank you for all your input.
The above feature is part of v0.8.0, so I'm going to close this issue. I think we found a fairly easy way to configure it, although the equivalent of --workhours is defined as an exclusion rather than an inclusion, similar to --offdays and --holidays.
I also think that at some point some configuration should be overridable by annotations or moved entirely to annotations, e.g. for users defining a mean-time-to-failure on a per-pod basis and independent of the cluster size (the "percentage" feature, #20).
from chaoskube.