Giter VIP home page Giter VIP logo

libzfs4j's Introduction

About libzfs.jar

The libzfs.jar is a Java wrapper for native ZFS functionality as implemented by a host operating system.

If you are here because of stack traces or segmentation fault logs which mention ZFS in a Java program, please take a look at the (#Troubleshooting) section below.

A bit of turbulent history

Since the ZFS filesystem, data integrity and volume management subsystem was introduced in the Sun Solaris 10 operating system over a decade ago (in 2005), the main interaction API was to use command-line tools. Also an OS-internal native library libzfs.so was available, but its purpose was interfacing those tools to the OS kernel. In absence of other ABIs, a few projects still risked to link against this "uncommitted" library -- including the libzfs.jar wrapper employed by Jenkins continuous integration and automation server, to use on matching platforms for tasks like snapshot management, etc. and a number of other projects e.g. on GitHub

As time went and Solaris evolved, some function signatures were changed in the library, causing a moderate bit of headache for consumers like this wrapper -- but it could be handwaved by requiring an upgrade to the newest release of the single OS.

As more time had passed, the Sun OpenSolaris project and later the illumos project and numerous open-source distributions based on that had splintered off from Solaris (which, as a brand and product, went along its own proprietary path and versioning under Oracle stewardship).

Also the ZFS technologies were adopted into multiple other operating system kernels (including several BSD's, Linux and MacOS efforts), cross-pollinating and evolving under the roof of the OpenZFS project. The libzfs.so library is still an "uncommitted implementation detail" of respective distributions, with random ABI changes causing now a much bigger headache due to fragmentation -- both because there is no single ZFS-wielding OS that you can require an upgrade to, and because there are dozens of combinations of possible function signatures that can be present in a particular instance and update-level of an OS deployment.

The effect for libzfs.jar and Jenkins in particular was that as the CI application server booted up on an OS with incompatible function signatures (and began interacting with ZFS datasets), its Java process just dumped core and died -- and with evolution of OpenZFS consuming and contributing operating systems, it became more likely than not to expect a wrong single fixed ABI.

Recent (2017) changes in this project aimed to specifically improve the portability of Jenkins to illumos distros -- the "SunOS" descendants based on modern OpenZFS. Also added was testability of this wrapper's codebase via Jenkins and Travis CI as integrated on GitHub, so contributors can test their improvements before posting a pull request.

The accepted solution for improvement of libzfs.jar wrapper portability was to identify which of the wrapped routines have different native ABI signatures "present in the wild" and introduce a way to pick and call the correct signature during run-time. While the updated JAR tries its best to guess the correct set for a given host OS, it also includes a way for the end-user (sysadmin) to enforce particular implementation for each such routine as well as the overall default, using environment variables or Java properties -- and since such settings can be passed from environment outside the Jenkins web-application, they would survive eventual upgrades of the jenkins.war). Also this technique is expandable, to uniformly handle more such cases as the future comes down upon us and the libzfs.jar sources have to be updated again :)

For details about currently supported names and values for such toggles please see the source code for your version of libzfs.jar (this is at the moment regarded as "implementation detail" so options are not listed here), or refer to the current Git HEAD status:

Settings ultimately applied to each toggle can be seen in application server (or standalone Jetty app) log if you start it with a FINE or greater log4j logging level. See also the lockpick-libzfs-abi.sh script that tries out the currently known toggle options and their values, to pick the correct settings for an end-user's deployment.

From practice, for late versions of Sun Solaris and several half a decade of operating systems and ZFS modules based on illumos and OpenZFS codebases, a likely end-user setup (e.g. in application server settings) would be:

LIBZFS4J_ABI=openzfs LIBZFS4J_ABI_zfs_iter_snapshots=legacy

while for illumos-based OSes with kernel since mid-2016 it would be all-new:

LIBZFS4J_ABI=openzfs

Note that there is more work possible in this area, such as in particular expanding Jenkins ZFS support to operating systems that do not identify as a SunOS, but this improvement is out of the scope for this update (the decision is made outside libzfs.jar codebase). It could help asking the wrapper whether it can represent ZFS on the host OS, rather than guessing by some strings the OS provides, though.

At this time one can wrap calls to initialization of a LibZFS instance in caller's set-up method (rather than using a pre-initialized static final class member) and catch resulting exceptions -- this should wrap both absence of ZFS on the host OS (or other inability to use it) and the end-user's explicit request to not use the wrapper by -DLIBZFS4J_ABI=off. See LibZFSTest.java for more details.

Troubleshooting

It is likely that you've got to this repository because your Jenkins server did not start, or even had its JVM segfaulted, with hints pointing at ZFS.

It should be helpful to increase log4j level of your application server to FINE or above, so this library would report more details about its activity to help you pin-point where and why it failed.

Certain situations are known to cause issues for out-of-the-box setups:

  • Mismatch of LibZFS ABI between the native code in your operating system libraries and the Java Native Interfaces in this repository. The JVM of a Jenkins server would crash a couple of minutes after startup, producing some hs_err_pid*.log files with stack traces that mention zfs.

    • See above about setting up the ABI to use for each routine that is known to have evolved into having several binary signatures, using application server properties or environment variables. In particular, give a shot to the lockpick-libzfs-abi.sh script that tries out the currently known toggle options and their values, to pick the correct settings for an end-user's deployment. OSes and their native libzfs.so interfaces do evolve and change over time, making older "good" settings obsolete.

    • Work with your OS distribution community or vendor to pre-package a Jenkins service (or other packages using libzfs4j) that would take these settings into account.

    • If the problem is with a routine whose signature is not handled by this library, please develop, test on your OS and propose a pull request at https://github.com/kohsuke/libzfs4j to handle the new binary dialect. See existing code for examples of handling this.

  • Too many datasets (including snapshots) on the Jenkins server - long startup and/or crash due to exhausting JVM memory. Might happen during calls into native code, then producing stack traces way too similar (on the first glance) to the ABI mismatch issue. Logging at a FINE level would show that your server looks at snapshots of all ZFS datasets in order, and maybe would log some out-of-heap errors before stalling or crashing; tools like top would show the java process size growing until its configured limit.

    • Constrain the ZFS scope visible to Jenkins by running it in a zone or otherwise dedicated environment.

    • Limit the amount of snapshots made on your system, such as by an automatic snapshots service (zfs-auto-snap, time-slider, znapzend etc.).

    • As a temporary fix, try increasing the -Xmx setting of your JVM and throw lots of RAM at it.

    • Fix Jenkins-core and/or this library to not track the whole universe by default during startup - it should suffice to know just the datasets mounted under JENKINS_HOME and maybe a few other similar locations.

Kudos

  • Kohsuke Kawaguchi
  • Jim Klimov
  • Oleg Nenashev
  • Adam Stevko

libzfs4j's People

Contributors

jimklimov avatar kohsuke avatar

Stargazers

Paul Hagedorn avatar Steve Fan avatar Alex Kiraly avatar  avatar  avatar Roman Stoffel avatar  avatar Oleg Nenashev avatar  avatar Henry Widd avatar Angus H. avatar

Watchers

 avatar Oleg Nenashev avatar  avatar  avatar

libzfs4j's Issues

Enable Travis CI

@kohsuke : now that .travis.yml file is here, you as the repo owner should enable the builds.

Log in to https://travis-ci.org (there's a button to use your GitHub account as the Travis account), tell it to watch your repos, and enable the slider to build for this one. Then upon the next commit or PR, it will auto-test. Take note to give a similar service (or perhaps better - e.g. more multiplatform) with a Jenkins cloud, so as to not lose ground to competitiors :)

For details see https://docs.travis-ci.com/user/getting-started/

linkage error

Hi,

Did you have this working at all? Curious to know what os as I'm getting a linkage error when creating LibZFS on ubuntu. Any tips appreciated.

Henry

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.