Cassie

Cassie provides Ruby application support for Apache Cassandra. It supplies components that most applications need but that are out of scope for the official cassandra-driver, including:

  • Database configuration
  • Cluster/session management
  • Cassandra server command helpers
  • Versioned schema migrations
  • Query classes and DSL
  • Test harnessing

Each of these components is designed to be used independently or together in a Ruby application. If you want to manage your own configuration, use Cassie for session management, and some other gem for your queries -- great!

Tested against:

  • Ruby: MRI 2.2, 2.3, and JRuby 1.9
  • cassandra-driver 3.0

Installation

# Gemfile
gem 'cassie', '~> 1.1.0'

or

$ gem install cassie

See cassie-rails for Rails integration.

Database Configuration

Cassie provides database connection configuration (e.g. cluster and session) per environment. Support for a default YAML back-end is provided.

$ cassie configuration:generate

Cassie::configurations are loaded from this configuration file at runtime.

Cassie.configurations
=> {"development"=>{"hosts"=>["127.0.0.1"], "port"=>9042, "keyspace"=>"my_app_development"}, "test"=>{"hosts"=>["127.0.0.1"], "port"=>9042, "idle_timeout"=>"nil", "keyspace"=>"my_app_test"}, "production"=>{"hosts"=>["cass1.my_app.biz", "cass2.my_app.biz", "cass3.my_app.biz"], "port"=>9042, "keyspace"=>"my_app_production"}}
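For orientation, a file that parses to a hash of this shape might look like the following. This is only a sketch: the exact layout of the generated file is an assumption here (the Configuration README is the authority), and the snippet uses Ruby's standard YAML library rather than Cassie itself.

```ruby
require "yaml"

# Hypothetical file contents; the file Cassie generates may differ.
EXAMPLE_YAML = <<~YAML
  development:
    hosts:
      - 127.0.0.1
    port: 9042
    keyspace: my_app_development
  test:
    hosts:
      - 127.0.0.1
    port: 9042
    keyspace: my_app_test
YAML

# Parse into a plain hash keyed by environment name.
CONFIGURATIONS = YAML.safe_load(EXAMPLE_YAML)

puts CONFIGURATIONS["development"]["keyspace"]
```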

Cassie.configuration pulls the appropriate configuration from Cassie.configurations, based on Cassie.env.

Cassie.env = "production"

Cassie.configuration
=> {"hosts"=>["cass1.my_app.biz", "cass2.my_app.biz", "cass3.my_app.biz"], "port"=>9042, "keyspace"=>"my_app_production"}

Cassie.keyspace
=> 'my_app_production'

Cassie.env prefers ENV["CASSANDRA_ENV"], then ENV["RACK_ENV"], and falls back to development.
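The lookup order can be pictured with a small stand-alone helper. `resolve_cassie_env` is a hypothetical name used only for illustration; it is not part of Cassie and does not cover an explicit `Cassie.env = ...` assignment.

```ruby
# Resolve the environment name from a hash-like source of environment
# variables, in the order described above.
# NOTE: `resolve_cassie_env` is a hypothetical helper, not Cassie's API.
def resolve_cassie_env(env = ENV)
  env["CASSANDRA_ENV"] || env["RACK_ENV"] || "development"
end

resolve_cassie_env("RACK_ENV" => "test")
```

Passing a plain hash instead of `ENV` keeps the helper easy to exercise in a test without touching the real process environment.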

See the Configuration README for more on features and usage.

Connection Handling

Cassie provides cluster and session connection creation according to cassandra-driver best practices.

Using cached cluster and session objects

cluster and session objects are created, cached, and reused globally.

# continuing from above 'production' configuration

Cassie.cluster
=> #<Cassandra::Cluster:0x3fc032f7f9b8> #<= configured with production cluster `configuration` options

Cassie.session
=> #<Cassandra::Session:0x3fc084caa344> #<= session scoped to default 'my_app_production' keyspace

Cassie.session(nil)
=> #<Cassandra::Session:0x3fc084caba22> #<= session without scoped keyspace

Cassie.session('my_other_keyspace')
=> #<Cassandra::Session:0x3fc084cabf33> #<= session scoped to 'my_other_keyspace' keyspace

Cassie.session
=> #<Cassandra::Session:0x3fc084caa344> #<= cached session, scoped to default 'my_app_production' keyspace

If using Cassie Configuration as described above via cassandra.yml, cluster configuration happens automatically. If not, assign a cluster environments hash to Cassie::configurations before using a cluster or session.
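The caching behavior shown in the session examples above can be sketched as per-keyspace memoization. `SessionCache` is a hypothetical class written for illustration, not Cassie's implementation; in a real application the block would typically call `cluster.connect(keyspace)` from cassandra-driver.

```ruby
# Hypothetical illustration of per-keyspace session memoization.
class SessionCache
  # Sentinel so that an explicit `nil` ("no keyspace") can be told apart
  # from "no argument given" (use the default keyspace).
  NO_ARGUMENT = Object.new.freeze

  def initialize(default_keyspace, &connector)
    @default_keyspace = default_keyspace
    @connector = connector # block that builds a session for a keyspace
    @sessions = {}
  end

  def session(keyspace = NO_ARGUMENT)
    keyspace = @default_keyspace if keyspace.equal?(NO_ARGUMENT)
    # Build the session once per keyspace, then reuse it.
    @sessions.fetch(keyspace) { @sessions[keyspace] = @connector.call(keyspace) }
  end
end
```

The sentinel is the one design choice worth noting: a plain `keyspace = nil` default would make `session` and `session(nil)` indistinguishable.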

Using cluster and session objects in Classes

Include Cassie::Connection in a class for session and keyspace functionality in your objects.

class MyQuery
  include Cassie::Connection

  # An explicit keyspace that will determine the session used
  # instead of falling back to the value in `Cassie.keyspace`
  # for all instances of this class.
  # Override `#keyspace` for per-object evaluation.
  keyspace :some_other_keyspace

  def find_user(id)
    # `session` is a vanilla Cassandra::Session
    # connected to `some_other_keyspace`
    session.execute('SELECT * FROM users WHERE id = ?;', arguments: [id])
  end
end
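
The class-level setting with a per-object override follows a common Ruby pattern, sketched below without Cassie. `KeyspaceScoped`, `ReportQuery`, and `TenantQuery` are hypothetical names, and the constant stands in for `Cassie.keyspace`; this is not Cassie's source.

```ruby
# Hypothetical stand-in for the pattern described above.
module KeyspaceScoped
  DEFAULT_KEYSPACE = "my_app_production" # stands in for `Cassie.keyspace`

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # With an argument: set the class-wide keyspace. Without: read it,
    # falling back to the global default.
    def keyspace(value = nil)
      @keyspace = value.to_s if value
      @keyspace || DEFAULT_KEYSPACE
    end
  end

  # Instances defer to the class unless `#keyspace` is overridden.
  def keyspace
    self.class.keyspace
  end
end

class ReportQuery
  include KeyspaceScoped
  keyspace :some_other_keyspace
end

class TenantQuery
  include KeyspaceScoped

  def initialize(tenant)
    @tenant = tenant
  end

  # Per-object evaluation: each instance picks its own keyspace.
  def keyspace
    "tenant_#{@tenant}"
  end
end
```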

See the Connection README for more on features and usage.

Cassandra Control

Cassie provides simple commands to control Cassandra execution in *nix development. These simplify execution and reduce output to provide faster management of your Cassandra processes.

Start

$ cassie start
Starting Cassandra...
[✓] Cassandra Running

Stop

$ cassie stop
Stopping Cassandra...
[✓] Cassandra Stopped

$ cassie stop
Couldn't single out a Cassandra process.
  - Is cqlsh running?
  - Kill all cassandra processes with --all
    - 9542  | /usr/local/apache-cassandra-3.0.8/bin/cqlsh.py
    - 2832  | org.apache.cassandra.service.CassandraDaemon

$ cassie stop --all
Stopping Cassandra...
[✓] Cassandra Stopped

Restart

$ cassie restart
Stopping Cassandra...
[✓] Cassandra Stopped
Starting Cassandra...
[✓] Cassandra Running

Tail

$ cassie tail
Tailing Cassandra system log, Ctrl-C to stop...
  /usr/local/cassandra/logs/system.log:

INFO  [main] 2016-09-23 11:18:05,073 StorageService.java:1902 - Node localhost/127.0.0.1 state jump to NORMAL
INFO  [main] 2016-09-23 11:18:05,215 NativeTransportService.java:75 - Netty using Java NIO event loop
INFO  [main] 2016-09-23 11:18:05,343 Server.java:159 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c]
INFO  [main] 2016-09-23 11:18:05,344 Server.java:160 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
INFO  [main] 2016-09-23 11:18:05,407 CassandraDaemon.java:477 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it

Versioned Schema Migrations

Cassie supports migration between schema states using semantically versioned, incremental migration files.

Schema version information is stored in Cassandra itself, in the cassie_schema.versions table (configurable).

A schema file holds the current state of the schema in-repo, at db/cassandra/schema.rb (configurable).

Various cassie <task> tasks are used to manage the schema version and migrations.

Tasks

Task               Description
migrations:import  Import existing cassandra_migrations migration files and convert to semantic versioning
migration:create   Generates an empty migration file prefixed with the next semantic version number
migrate            Migrates the schema by running the up methods in any migrations starting after the current schema version
migrate:reset      Runs schema:reset and migrate
schema:init        Create versioned migrations schema, and the environment's keyspace if it doesn't exist
schema:version     Print the current schema version information for the Cassandra cluster
schema:history     Print the historical version information for the current Cassandra cluster state
schema:status      Print the migration status for each local migration (up/down)
schema:load        Creates the schema by executing the CQL schema in the schema file (db/cassandra/schema.rb by default)
schema:drop        Drops keyspace(s)
schema:dump        Dumps the schema for all non-system keyspaces in CQL format (to db/cassandra/schema.rb by default)
schema:reset       Runs schema:drop and schema:load
schema:import      Create an initial migration based on the current Cassandra non-system schema
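
To illustrate what "migrations starting after the current schema version" means for semantic versions, here is a minimal sketch that uses `Gem::Version` from RubyGems as a stand-in. Cassie's own version class and migration file naming are not shown in this document, so `pending_versions` is a hypothetical helper.

```ruby
require "rubygems" # Gem::Version; normally already loaded

# Return the local versions that come after the current schema version,
# in the order they would need to run.
# Gem::Version compares segments numerically, so "0.0.10" sorts after "0.0.9",
# which a plain string sort would get wrong.
def pending_versions(local_versions, current_version)
  current = Gem::Version.new(current_version)
  local_versions
    .map { |v| Gem::Version.new(v) }
    .select { |v| v > current }
    .sort
    .map(&:to_s)
end

pending_versions(["0.0.1", "0.0.10", "0.0.9", "0.1.0"], "0.0.9")
```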

See the Migrations README for more on features and usage.

Query DSL

Cassie provides base Query classes to manage interactions with the database. Create application-specific subclasses and construct queries with a simple CQL DSL.

class UserByUsernameQuery < Cassie::Query

  select_from :users_by_username

  where :username, :eq

  consistency :quorum
end

UserByUsernameQuery.new.fetch_first(username: "eprothro")
=> #<Struct user_id=123, username="eprothro">

See the Queries README for more on features and usage.

Test Harnessing

Avoid making queries into the persistence layer when you can afford it.

some_query = SomeQuery.new
some_query.extend(Cassie::Testing::Fake::Query)
some_query.session.rows = [{'user_id' => 123, 'username' => 'eprothro'}]

some_query.fetch
=> [#<Struct user_id=123, username="eprothro">]

some_query.session.last_statement
=> #<Cassandra::Statements::Simple:0x3ffde09930b8 @cql="SELECT * FROM users LIMIT 500;" @params=[]>
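
The idea behind a fake session can be shown with a few lines of plain Ruby. `RecordingSession` below is a hypothetical minimal fake written for illustration; Cassie::Testing::Fake ships its own implementation, which may behave differently.

```ruby
# Hypothetical minimal fake: returns canned rows and remembers the last
# statement instead of talking to Cassandra.
class RecordingSession
  attr_accessor :rows
  attr_reader :last_statement

  def initialize
    @rows = []
  end

  # Same call shape as Cassandra::Session#execute(statement, options).
  def execute(statement, options = {})
    @last_statement = statement
    rows
  end
end

session = RecordingSession.new
session.rows = [{ "user_id" => 123, "username" => "eprothro" }]
session.execute("SELECT * FROM users LIMIT 500;")
```

A test can then assert on both the returned rows and `last_statement` without a running cluster.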

See the Testing README for more on features and usage.

Contributing

Pull requests and issues are welcome. Please read the contributing guidelines.

cassie's People

Contributors

eprothro

cassie's Issues

schema:init - Doesn't support multiple DC out of the box

The cassie schema management keyspace is created with the simple replication strategy. If you have a cluster with a multiple-DC configuration, the cassie schema tasks will fail.

Workaround:

Attempt to initialize

cassie schema:init

Alter the keyspace manually

Edit the cassie_schema keyspace manually (in cqlsh) to have the proper replication settings.

Initialize again

cassie schema:init
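
For the manual step, the statement to run in cqlsh is an ALTER KEYSPACE with NetworkTopologyStrategy. The sketch below only builds that CQL string: `replication_cql` is a hypothetical helper, and the data center names and replication factors are placeholders that must match your own cluster.

```ruby
# Build an ALTER KEYSPACE statement that switches a keyspace to
# NetworkTopologyStrategy.
# NOTE: hypothetical helper; "dc1"/"dc2" and the factors are placeholders.
def replication_cql(keyspace, datacenters)
  options = datacenters.map { |dc, rf| "'#{dc}': #{Integer(rf)}" }
  "ALTER KEYSPACE #{keyspace} WITH replication = " \
    "{'class': 'NetworkTopologyStrategy', #{options.join(', ')}};"
end

puts replication_cql("cassie_schema", "dc1" => 3, "dc2" => 2)
```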

add support for UDTs to `to_cql`

to_cql falls over with certain UDTs, raising an exception such as:

ArgumentError:
       value must be an instance of Time, ["foo_at", 2017-10-31 12:10:00 -0500] given

Add rspec matchers for cassandra statements

expect(query.statement).to have_cql_assignment(:field)
expect(query.statement).to have_cql_assignment(:field, value)
expect(query.statement).to have_cql_relation(:field)
expect(query.statement).to have_cql_relation(:field, value)

This would go a long way toward letting clients stop their testing plane sooner and get faster Cassandra query coverage.

Make Query#to_cql handle invalid values more clearly

type
#=> #<Cassandra::Types::Simple:0x007fed37c33300 @kind=:bigint>

params[i]
#=> nil

params
#=> ["55e05d24-513b-11e7-b6b8-33c23e6e0000", 2017-06-14 14:54:54 -0500, nil, nil, "55e0893e-513b-11e7-bd08-0f7a965973cc", "testuser94517816"]

bound_cql
#=> "INSERT INTO some_table (id, other_id, url, other_url, user_id, username) VALUES (?, ?, ?, ?, ?, ?);"

type
#=> #<Cassandra::Types::Simple:0x007fed37c33300 @kind=:bigint>

type.new(nil)
     TypeError:
       can't convert nil into Integer

It would be nice to catch type errors and give more information about which param is the culprit.
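
One possible shape for this, sketched without Cassie or cassandra-driver: wrap each coercion and re-raise with the parameter's index and column name. `coerce_params` and its arguments are hypothetical, and the lambdas stand in for the driver's type objects.

```ruby
# Hypothetical sketch: coerce each param and, on failure, report which
# param was at fault instead of a bare TypeError.
def coerce_params(columns, params, coercers)
  params.each_with_index.map do |value, i|
    begin
      coercers[i].call(value)
    rescue TypeError, ArgumentError => e
      raise ArgumentError,
            "param #{i} (#{columns[i]}) is invalid: #{value.inspect} (#{e.message})"
    end
  end
end

columns  = %w[username other_id]
coercers = [->(v) { String(v) }, ->(v) { Integer(v) }]

begin
  coerce_params(columns, ["testuser", nil], coercers)
rescue ArgumentError => e
  puts e.message
end
```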

Allow interoperability for fake session and real session prepared statements

Preparing a statement through the fake session results in a fake prepared statement being put in the statement cache.

If that fake prepared statement is then executed later with a real session, there can be type hinting issues like

Cassandra::Errors::InvalidError:
       Expected 4 or 0 byte int (8)

Only affects test/spec query usage.

tasks that drop schema often time out

Cassandra::Errors::TimeoutError for non-trivial schemas when running:

  • schema:drop
  • schema:reset
  • migrate:reset

Cassandra receives the request and executes it successfully, but the client times out while waiting, so the rest of the task does not run after that failure.

In all cases there should be no harm in simply re-running.

However, probably a good idea to set the query timeout longer for these queries.
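
A minimal sketch of that idea, assuming the `:timeout` option (in seconds) accepted by `Cassandra::Session#execute` in cassandra-driver. `drop_keyspace` and the 60-second default are hypothetical, chosen only for illustration.

```ruby
# Run the drop with a longer per-request timeout than the session default.
# `session` is expected to behave like Cassandra::Session.
# NOTE: `drop_keyspace` and the 60s default are assumptions, not Cassie's API.
def drop_keyspace(session, keyspace, timeout: 60)
  session.execute("DROP KEYSPACE IF EXISTS #{keyspace};", timeout: timeout)
end
```

Because the statement uses IF EXISTS, re-running it after a client-side timeout is harmless, which matches the note above about simply re-running.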

fetch_first doesn't work when cursoring is enabled

If cursor_by has been called on a cassie query class, fetch_first will fail, with:

A singleton method has already been defined for `limit`. `with_limit` can't be implemented.
     # ./lib/cassie/statements/statement/limiting.rb:89:in `assert_no_limit_singleton'
     # ./lib/cassie/statements/statement/limiting.rb:70:in `define_limit_singleton'
     # ./lib/cassie/statements/statement/limiting.rb:46:in `with_limit'
     # ./lib/cassie/statements/execution/peeking.rb:12:in `execute'
     # ./lib/cassie/statements/execution/fetching.rb:25:in `fetch'
     # ./lib/cassie/statements/execution/fetching.rb:38:in `block in fetch_first'
