tumblr / genesis
A tool for data center automation
Home Page: http://tumblr.github.io/genesis/
License: Apache License 2.0
A clear, concise explanation for what genesis is, what it is used for, and what the respective components are is needed as the first section in the README.
It would be really helpful to be able to access the Genesis environment over IPMI Serial over LAN rather than needing full graphical console access.
If you take a look here: https://github.com/tumblr/genesis/blob/master/testenv/bootbox/web/config/config.yml
There's a bunch of URLs and filenames which are Tumblr specific.
As the Virident setup tasks aren't included in the public repo, not sure why the config entries for them were even left in.
Also, while not inherently "bad" per se, it's probably not wise to expose nexus / AAA / Collins / mrepo internal Tumblr host names and file system structures either.
The build process (sometimes??) leaves dangling /dev/loopN devices.
My Makefile contains shenanigans for cleaning those up which are relatively simple under Linux.
I imagine that on a Mac, the Docker Linux VM could get cluttered up in a way that is even harder to clean up.
It would be better if the build process just cleaned up after itself.
I think the culprit is the livecd-iso-to-pxeboot tool.
Assuming that it won't be fixed any time soon, it is possible that after create-image.sh#L78 we could add:
losetup -a | grep /genesis/bootcd/genesis.iso | cut -d: -f1 | xargs -r -n 1 losetup -d
Further testing of this idea is needed.
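To make the proposed cleanup easier to test without touching real devices, the filtering half of that pipeline could be sketched in Ruby as below. This is only a sketch: `loop_devices_for` is a hypothetical helper, and the sample text assumes the classic `losetup -a` line layout (`device: [dev]:inode (backing file)`), which may differ between util-linux versions.

```ruby
# Hypothetical helper mirroring `grep <file> | cut -d: -f1` from the
# one-liner above: return the loop devices whose line mentions backing_file.
def loop_devices_for(losetup_output, backing_file)
  losetup_output.each_line
                .select { |line| line.include?(backing_file) }
                .map { |line| line.split(':', 2).first.strip }
end

# Illustrative sample only; not captured from a real build host.
SAMPLE_LOSETUP_OUTPUT = <<~OUT
  /dev/loop0: [2049]:131 (/genesis/bootcd/genesis.iso)
  /dev/loop1: [2049]:977 (/var/tmp/other.img)
OUT

stale = loop_devices_for(SAMPLE_LOSETUP_OUTPUT, '/genesis/bootcd/genesis.iso')
stale.each { |dev| puts "would detach #{dev}" }
```

In a real build step the loop body would call `system('losetup', '-d', dev)` instead of printing, fed by the live output of `losetup -a`.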
The config currently specifies the following properties:
:ipxe:
:images:
:initrd: <%= @ipxe_initrd %>
:kernel: <%= @ipxe_kernel %>
:kernel_flags: <%= @ipxe_kernel_flags %>
:menu_default: intake
:image_server: http://<%= @image_service %>/genesis
:genesis_server: http://<%= @genesis_service %>
I don't believe these are used by genesis itself. Speaking with @roymarantz, I understand these were moved to configs in PHIL. Please verify and remove them.
Looking at https://github.com/tumblr/genesis/blob/master/testenv/bootbox/web/config/config.yml.sample
I think the framework should separate the configuration directives necessary for the framework to function (read: ipxe, genesis_server, gem, collins, ntp server) from the specific configuration directives that are consumed by tasks (e.g. virident, bios_settings, verify_tool). Each task may or may not rely on some specific configuration (e.g. configure_virident wants to know what firmware tarballs there are, or what version to install), and can ask for that specific config before running. I want to avoid having a single large yaml config that has every setting for every task. It becomes difficult to identify which task depends on which config directives (as we have seen with visioner).
I would be happy to start breaking this apart and writing a config fetch mechanism that runs before each task. @maddalab @roymarantz @defect
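As a starting point for discussion, a per-task config fetch could look roughly like the sketch below. Everything in it is hypothetical: the `TaskConfigStore` class, the in-memory YAML strings standing in for whatever the real fetch would hit, and the key names are illustrations, not existing genesis APIs.

```ruby
require 'yaml'

# Hypothetical store: maps a task name to that task's own YAML config.
# In practice the YAML would be fetched from a server just before the
# task runs; plain strings are used here to keep the sketch self-contained.
class TaskConfigStore
  def initialize(sources)
    @sources = sources
  end

  # Load only the named task's config and fail early if a key the task
  # declared as required is absent.
  def fetch(task_name, required_keys)
    config = YAML.safe_load(@sources.fetch(task_name, '{}')) || {}
    missing = required_keys - config.keys
    unless missing.empty?
      raise ArgumentError, "#{task_name} is missing config: #{missing.join(', ')}"
    end
    config
  end
end

# Placeholder values only; the key names are assumptions.
store = TaskConfigStore.new(
  'configure_virident' => "firmware_version: example-version\ntarballs:\n  - example.tar.gz\n"
)
config = store.fetch('configure_virident', %w[firmware_version tarballs])
puts config['firmware_version']
```

Keeping the required keys next to the task makes it obvious which task depends on which directives, which is the problem with the single large yaml.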
Related to tumblr/collins#550, CPU speed is incorrectly captured during induction as a result of CPU frequency scaling. When lshw is executed (as part of adding a new node into collins), the CPU speed can vary depending on the load of the server.
Ideally CPU scaling would be disabled and lshw would report the correct / maximum frequency of the processor.
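If disabling scaling in the image turns out to be awkward, one possible alternative (an assumption on my part, not something genesis or collins does today) is to read the advertised maximum from the Linux cpufreq sysfs interface, which reports `cpuinfo_max_freq` in kHz. The helper name `max_cpu_mhz` is hypothetical, and the file only exists when a cpufreq driver is loaded.

```ruby
# Hypothetical helper: return the maximum frequency of a CPU in MHz as
# advertised by cpufreq, or nil when the sysfs file is not available.
# sysfs_root is a parameter so the function can be exercised against a
# fake directory tree.
def max_cpu_mhz(cpu = 0, sysfs_root = '/sys/devices/system/cpu')
  path = File.join(sysfs_root, "cpu#{cpu}", 'cpufreq', 'cpuinfo_max_freq')
  return nil unless File.readable?(path)

  # The kernel exposes this value in kHz.
  File.read(path).strip.to_i / 1000.0
end

mhz = max_cpu_mhz
puts(mhz ? "cpu0 max: #{mhz} MHz" : 'cpufreq information not available')
```

Unlike the current clock speed, this value should not change with the load on the server.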
I'm sure anyone who hasn't worked at Tumblr on the SRE team would likely struggle a bit trying to actually get this going in production. Maybe a guide that instructs what and how to set this up and have it be used in an existing Collins environment and/or any environment?
Has there been any attempt to get UEFI support working in the genesis boot image? Given that new hardware these days seems to come with UEFI enabled out of the box it'd be nice to use that and not have to manually change each server. If it has not been attempted I will try and make it work and PR it.
I was thinking about the pain around configuring genesis images, and it struck me that we could completely remove any configuration necessary to make a live image by moving genesis-specific setup into an at-runtime configuration like cloud-init. Instead of running scripts to install ruby/genesis gems and configure specific things about the bootable image, we could just distribute a cloud-init yaml that sets up genesis at boot.
That way, users could just boot any live image they want (let's say centos7) and tweak the cloud-init config as they see fit (perhaps they want ruby 1.9, or some extra packages installed). This would provide a very clean method for downstreams to configure genesis to do whatever they want, like adding ssh authorized keys, changing root pw, installing extra software, fetching configs, adding users, etc. without needing to rebuild and deploy a new image.
Each release should include a bootable .iso, vmlinuz, and initrd for each supported genesis OS version. This would make it much easier for downstream users to get started, and seeing as the bootable image is basically static, it would remove a huge speedbump for every potential user.
When trying to set up Genesis from scratch, I run into an issue where the genesis bootloader cannot load retryingfetcher or promptcli. LoadError: cannot load such file -- 'retryingfetcher'
(@maddalab: Is this what you saw as well?)
At first I thought it was DNS messing up the installation of this gem somehow, but both of these are installed when creating the image with create-image.sh. See genesis.ks L221 and genesis.ks L250.
So, in the testenv, kickstart fetches the gems through the web app (from static files in /genesis/src/*/), stores them in /root/repo/gems, and then installs them from there. The problems I see here are:
I'll get a branch going with some README updates and leave the floor open for discussion about the process.
https://github.com/tumblr/genesis/blob/master/src/framework/lib/genesisframework/task.rb#L98-L100 does not support an optional level parameter. This may be useful for things like logging to elasticsearch or collins, so one may optionally specify the level of the log message from the task. As it exists now, all log messages in tasks (e.g. using https://github.com/tumblr/genesis/blob/master/tasks/modules/logging/collins.rb) will appear at the same level, making it very confusing to debug error messages logged at level "info".
Ideally, the task DSL would allow for: log "Running a thing" as well as log "shit, something exploded", :error.
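A minimal sketch of what that interface could look like is below. It is a stand-in for illustration, not the actual genesisframework code: the `LeveledLog` module, the level list, and the in-memory `log_sink` (in place of a real backend such as collins or elasticsearch) are all assumptions.

```ruby
# Hypothetical mixin giving tasks a `log` method with an optional level.
module LeveledLog
  LEVELS = %i[debug info warn error].freeze

  # level defaults to :info so existing one-argument calls keep working.
  def log(message, level = :info)
    raise ArgumentError, "unknown log level: #{level}" unless LEVELS.include?(level)

    log_sink << { level: level, message: message }
  end

  # In-memory stand-in for a real logging backend.
  def log_sink
    @log_sink ||= []
  end
end

class ExampleTask
  extend LeveledLog
end

ExampleTask.log 'Running a thing'
ExampleTask.log 'something exploded', :error
ExampleTask.log_sink.each { |entry| puts "[#{entry[:level]}] #{entry[:message]}" }
```

Defaulting the second argument keeps the change backward compatible for tasks that only pass a message.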
When a task fails, no backtrace/line number is printed. This makes debugging failures in tasks kind of an annoying task. For example:
ProvisionFormatDisks :: run #3 caused error: undefined local variable or method `n' for Genesis::Framework::Tasks::ProvisionFormatDisks:Class
Would be useful for the runner to print the backtrace, filename, and line number for the exception, so we can at least see what happened in the genesis logs.
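For illustration, the runner's rescue path could format the exception along these lines. This is a sketch under assumptions: `describe_failure` and `failing_step` are hypothetical names, and how the real runner catches task errors is not shown in this issue.

```ruby
# Hypothetical formatter: build a report containing the exception class,
# message, and the first few backtrace frames (each frame carries the
# file name and line number).
def describe_failure(task_name, error, max_frames = 5)
  frames = (error.backtrace || []).first(max_frames)
  lines = ["#{task_name} :: #{error.class}: #{error.message}"]
  lines.concat(frames.map { |frame| "    from #{frame}" })
  lines.join("\n")
end

# Stand-in for a task body that references an undefined name.
def failing_step
  undefined_helper
end

begin
  failing_step
rescue StandardError => e
  puts describe_failure('ProvisionFormatDisks', e)
end
```

Capping the number of frames keeps the genesis logs readable while still pointing at the failing line.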
You can close this out but just wanted to let you know the following if it's helpful at all. The collins key in the latest version that was unpublished went from accessing collins as a string to accessing it as a symbol. Additionally the log function has an implicit interface that went from passing in one parameter to passing in two.
These seem to be the two significant changes.
https://github.com/tumblr/genesis/blob/master/src/framework/lib/genesisframework/utils.rb#L35
https://github.com/tumblr/genesis/blob/master/src/framework/lib/genesisframework/utils.rb#L82
Would you mind documenting or at least giving a hint on how you go from Genesis (and intake/burn in) to provisioning a host? If PXE boot is pointed at Genesis, I don't see how you can provision a host with kickstart for instance. Are you doing something within Genesis to do this process?
We've been using the SL6-based genesis image for years (later adding the kernel-ml package from ELRepo to get newer hardware support). Looks like it's the end of the line for ELRepo supporting EL6, and we are now starting to face hardware support issues, especially with new LSI RAID cards. Has there been any attempt to use a newer release of a RHEL derivative? A few months back I tried to build a release with Rocky 8 and ran into some issues. I'm planning on picking it up again with Rocky 9 now that that's out, but I figured I should at least ask here if this is something you've considered, since I assume you also have need for newer hardware support. Have you made any attempts to get a newer version of centos/rocky/almalinux working with genesis?
If you update the instructions for the Bootbox's Vagrant to use the vagrant-vbguest plugin, then it doesn't matter which version of VirtualBox is installed, as the VM image will be upgraded automatically with the latest VirtualBox tools if needed.
I was trying to play around with the test env provided via vagrant in the repository and it seems to be missing. https://www.dropbox.com/s/pvydgdcnf4o00im/sl-base-v4.3.10.box?dl=1 404s for me. Do you happen to have a new location I can pull this image from?
The "general workflow" part of the readme doesn't align with the features actually released. The iPXE network booting components require Phil or the code I had written previously which combined that part of Phil into the Genesis web server. I'd recommend either scratching that part from the documentation or, even better, actually open sourcing the rest of Genesis/Phil which includes those portions. :)