Giter VIP home page Giter VIP logo

pech's Introduction

I found more comfortable to hack Ceph, analysing protocol
implementation, monitor and OSD client code reading linux kernel
sources, instead of legacy OSD C++ code or (god forbid!)  Crimson
project.

Here, in the sources, the level of craziness is above all permissible
norms: I took (almost) all Ceph sources from net/ceph/ kernel path and
build them in userspace as a simplest OSD possible which has a perfect
name `Pech` (Germans speakers will understand).

I do not use threads, I use cooperative scheduling and jump from task
contexts using setjmp()/longjmp() calls. This model perfectly fits UP
kernel with disabled preemption, thus adapted workqueue.c and timer.c
code runs the event loop.

So again, no atomic operations, no locks, everything is one thread.
In future number of event loops can be equal to a number of physical
CPUs, where each event loop is executed from a dedicated pthread
context and pinned to a particular CPU.

What Pech OSD does?

  o Connects to monitors and "boots" OSD, i.e. marks it as UP.
  o On Ctrl+C marks OSD on monitors as DOWN and gracefully exits.
  o OSD operations supported in memory:

     - OP_WRITE
     - OP_WRITEFULL
     - OP_READ
     - OP_SYNC_READ
     - OP_STAT
     - OP_CALL
     - OP_OMAPGETVALS
     - OP_OMAPGETVALSBYKEYS
     - OP_OMAPSETVALS
     - OP_OMAPGETKEYS
     - OP_GETXATTR
     - OP_SETXATTR
     - OP_CREATE

  So simple fio/examples/rados.fio load can be run.

  For RBD images (e.g. fio/examples/rbd.fio) OSD class directory should
  be specified, i.e. where is your upstream Ceph built is located:
  `class_dir=$CEPH/build/lib`.

  The thing is that RBD images require loading of OSD object classes,
  which are shared objects and loaded by the OSD on OP_CALL request.
  The original Ceph OSD object classes can't be simply loaded from
  Pech, because it is written on C and does not provide any C++
  interfaces or Ceph common libraries, like for example bufferlist.
  In order to provide missing C++ classes and C++ interface we need a
  proxy library, which acts as a bridge between Pech OSD and OSD
  object classes.  The Pech proxy library can be found here [2], so
  `make ceph_pech_proxy` should be called in order to build it. When
  everything is built and `class_dir` is specified rbd.fio load should
  work just fine.

What is not yet ported from kernel sources?

  o Crypto part is noop, thus monitors should be run with auth=none.
    To make cephx work direct copy-paste of kernel crypto sources
    has to be done, or a wrapper over openssl lib should be written,
    see src/ceph/crypto.c interface stubs for details.

What is the Great Idea behind?

  o I need easy hackable OSD in C with IO path only, without failover,
    log-based replication, PG layer and all other things. I want to
    test different replication strategies (client-based, primary-copy,
    chain) having simplest and fastest file storage (yes, step back to
    filestore) which reads and writes directly to files.

    Eventually this Pech OSD can be a starting point to something
    different, something which is not RADOS, which is fast, with
    minimum IO ordering requirements and acts as a RAID 1 cluster,
    e.g. something which is described here [1].

Make:

  $ make -j8

Start new Ceph cluster with 1 OSD and then stop everything.  We start
monitors on specified port, proto v1 and with -X option, i.e. auth=none.

  $ CEPH_PORT=50000 MON=1 MDS=0 OSD=1 MGR=0 ../src/vstart.sh --memstore \
    -n -X --msgr1
  $ ../src/stop.sh

Restart only Ceph monitor(s):

  $ MON=1 MDS=0 OSD=0 MGR=0 ../src/vstart.sh

Start pech-osd accessing monitor over v1 protocol:

  $ ./pech-osd mon_addrs=ip.ip.ip.ip:50001 name=0 fsid=`cat ./osd0/fsid` \
    log_level=5

For the case when RBD images are required Pech proxy should be built
and `class_dir` should be specified:

  $ ./pech-osd mon_addrs=ip.ip.ip.ip:50001 name=0 fsid=`cat ./osd0/fsid` \
    class_dir=$CEPH/build/lib log_level=5

For DEBUG purposes maximum output log level can be specified: log_level=7

Have fun!

--
Roman

[1] https://lists.ceph.io/hyperkitty/list/[email protected]/thread/N46NR7NBHWBQL4B2ASU7Y2LMKZZPK3IX/
[2] https://github.com/rouming/ceph/tree/pech-osd

pech's People

Contributors

rouming avatar idryomov avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.