Solid Cache

Solid Cache is a database-backed Active Support cache store implementation.

Using SQL databases backed by SSDs, we can have caches that are much larger and cheaper than traditional memory-only Redis or Memcached caches.

Testing on HEY shows that reads and writes are 25%-50% slower than with a Redis cache (1.2ms vs 0.8-1ms per single-key read), but this is not a significant percentage of the overall request time.

If cache misses are expensive (up to 50x the cost of a hit on HEY), then there are big advantages to caches that can hold months rather than days of data.

Usage

To set Solid Cache as your Rails cache, you should add this to your environment config:

config.cache_store = :solid_cache_store

Solid Cache is a FIFO (first in, first out) cache. While this is not as efficient as an LRU cache, this is mitigated by the longer cache lifespans.

A FIFO cache is much easier to manage:

  1. We don't need to track when items are read.
  2. We can estimate and control the cache size by comparing the maximum and minimum IDs.
  3. By deleting from one end of the table and adding at the other end we can avoid fragmentation (on MySQL at least).
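The points above can be illustrated with a toy in-memory model. This is only a sketch: `FifoTable` and its methods are hypothetical names for illustration and are not part of Solid Cache. It shows how the ID range gives a cheap size estimate and how expiry only ever touches the oldest end.

```ruby
# Toy model of a FIFO cache table: rows get auto-incrementing IDs,
# new rows are appended at the high end and expired from the low end.
# (Hypothetical class for illustration; not Solid Cache's implementation.)
class FifoTable
  def initialize
    @rows = {}     # id => [key, value]
    @next_id = 1
  end

  # Append a row at the "new" end of the table.
  def write(key, value)
    @rows[@next_id] = [key, value]
    @next_id += 1
  end

  # Estimate the row count from the ID range alone. Gaps left by
  # individual deletes are ignored, so this can overestimate.
  def estimated_size
    return 0 if @rows.empty?
    @rows.keys.max - @rows.keys.min + 1
  end

  # Remove up to `count` of the oldest rows (lowest IDs).
  def expire_oldest(count)
    @rows.keys.min(count).each { |id| @rows.delete(id) }
  end

  # Delete one row by key, leaving a gap in the ID sequence.
  def delete(key)
    @rows.delete_if { |_id, (k, _v)| k == key }
  end

  def actual_size
    @rows.size
  end
end

table = FifoTable.new
10.times { |i| table.write("key#{i}", i) }
table.delete("key4")        # leaves a gap at ID 5
table.expire_oldest(3)      # removes IDs 1, 2 and 3
puts table.estimated_size   # 7 (IDs 4..10, gap included)
puts table.actual_size      # 6
```

No read path is needed for bookkeeping here, which is the first advantage listed above: reads never update any per-row state.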

Installation

Add this line to your application's Gemfile:

gem "solid_cache"

And then execute:

$ bundle

Or install it yourself as:

$ gem install solid_cache

Add the migration to your app:

$ bin/rails solid_cache:install:migrations

Then run it:

$ bin/rails db:migrate

Configuration

Engine configuration

There are two options that can be set on the engine:

  • executor - the Rails executor used to wrap asynchronous operations, defaults to the app executor
  • connects_to - a custom connects_to value for the abstract SolidCache::Record Active Record model. Required for sharding and/or using a separate cache database from the main app.

These can be set in your Rails configuration:

Rails.application.configure do
  config.solid_cache.connects_to = {
    shards: {
      shard1: { writing: :cache_primary_shard1 },
      shard2: { writing: :cache_primary_shard2 }
    }
  }
end

Cache configuration

Solid Cache supports these options in addition to the standard ActiveSupport::Cache::Store options.

  • error_handler - a Proc to call to handle any ActiveRecord::ActiveRecordError exceptions that are raised (default: log errors as warnings)
  • expiry_batch_size - the batch size to use when deleting old records (default: 100)
  • expiry_method - the expiry method to use, :thread or :job (default: :thread)
  • max_age - the maximum age of entries in the cache (default: 2.weeks.to_i)
  • max_entries - the maximum number of entries allowed in the cache (default: nil, meaning no limit)
  • cluster - a Hash of options for the cache database cluster, e.g. { shards: [:database1, :database2, :database3] }
  • clusters - an Array of Hashes for multiple cache clusters (ignored if :cluster is set)
  • active_record_instrumentation - whether to instrument the cache's queries (default: true)
  • clear_with - clear the cache with :truncate or :delete (default: :truncate, except when Rails.env.test?, in which case :delete)
  • max_key_bytesize - the maximum size of a normalized key in bytes (default: 1024)

For more information on cache clusters, see Sharding the cache.
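As a sketch of how these options might be passed, the fragment below uses the same array form as the sharding example in this README. The specific values are illustrative assumptions, not recommendations, and the fragment only makes sense inside a Rails app with Solid Cache installed.

```ruby
# config/environments/production.rb
Rails.application.configure do
  config.cache_store = [
    :solid_cache_store,
    max_age: 1.week.to_i,      # expire entries older than one week
    max_entries: 1_000_000,    # illustrative cap on the estimated entry count
    expiry_batch_size: 200,    # delete up to 200 records per expiry run
    expiry_method: :job        # run expiry in a background job instead of a thread
  ]
end
```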

Cache expiry

Solid Cache tracks writes to the cache. For every write it increments a counter by 1. Once the counter reaches 80% of the expiry_batch_size it adds a task to run on a background thread. That task will:

  1. Check whether we have exceeded the max_entries value (if set) by subtracting the minimum ID from the maximum ID in the SolidCache::Entry table (this is an estimate that ignores any gaps).
  2. If we have, delete expiry_batch_size entries.
  3. If not, delete up to expiry_batch_size entries, provided they are all older than max_age.

Expiring when we reach 80% of the batch size allows us to expire records from the cache faster than we write to it when we need to reduce the cache size.

Only triggering expiry when we write means that if the cache is idle, the background thread is also idle.
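The expiry flow described above can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: `ToyExpiry` and `Entry` are hypothetical names, time is a plain integer, expiry runs inline rather than on a background thread, and step 3 is interpreted as "delete only those of the oldest entries that are older than max_age".

```ruby
# Toy, in-memory model of write-triggered expiry (not Solid Cache's code).
Entry = Struct.new(:id, :created_at)

class ToyExpiry
  attr_reader :entries, :runs

  def initialize(expiry_batch_size: 100, max_age: 14 * 24 * 3600, max_entries: nil)
    @expiry_batch_size = expiry_batch_size
    @max_age = max_age
    @max_entries = max_entries
    @entries = []
    @next_id = 1
    @writes_since_expiry = 0
    @runs = 0
  end

  # Record a write. Once 80% of a batch has been written, run expiry.
  def write(now:)
    @entries << Entry.new(@next_id, now)
    @next_id += 1
    @writes_since_expiry += 1
    return if @writes_since_expiry < @expiry_batch_size * 0.8

    @writes_since_expiry = 0
    expire(now: now)
  end

  private

  def expire(now:)
    @runs += 1
    oldest = @entries.sort_by(&:id).first(@expiry_batch_size)
    doomed =
      if over_max_entries?
        oldest                                               # step 2: a full batch
      else
        oldest.select { |e| now - e.created_at > @max_age }  # step 3: stale entries only
      end
    @entries -= doomed
  end

  # Step 1: size estimate from the ID range, ignoring gaps.
  def over_max_entries?
    return false if @max_entries.nil? || @entries.empty?
    ids = @entries.map(&:id)
    ids.max - ids.min + 1 > @max_entries
  end
end

cache = ToyExpiry.new(expiry_batch_size: 10, max_entries: 20)
100.times { |i| cache.write(now: i) }
puts cache.runs           # one expiry run per 8 writes
puts cache.entries.size   # stays close to max_entries
```

Because each run can delete a full batch of 10 but is triggered after only 8 writes, the simulated cache shrinks faster than it grows whenever it is over its limit.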

If you want the cache expiry to be run in a background job instead of a thread, you can set expiry_method to :job. This will enqueue a SolidCache::ExpiryJob.

Using a dedicated cache database

Add database configuration to database.yml, e.g.:

development:
  cache:
    database: cache_development
    host: 127.0.0.1
    migrations_paths: "db/cache/migrate"

Create database:

$ bin/rails db:create

Install migrations:

$ bin/rails solid_cache:install:migrations

Move migrations to custom migrations folder:

$ mkdir -p db/cache/migrate
$ mv db/migrate/*.solid_cache.rb db/cache/migrate

Set the engine configuration to point to the new database:

Rails.application.configure do
  config.solid_cache.connects_to = { database: { writing: :cache } }
end

Run migrations:

$ bin/rails db:migrate

Sharding the cache

Solid Cache uses the Maglev consistent hashing scheme to shard the cache across multiple databases.

To shard:

  1. Add the configuration for the database shards to database.yml
  2. Configure the shards via config.solid_cache.connects_to
  3. Pass the shards for the cache to use via the cluster option

For example:

# config/database.yml
production:
  cache_shard1:
    database: cache1_production
    host: cache1-db
  cache_shard2:
    database: cache2_production
    host: cache2-db
  cache_shard3:
    database: cache3_production
    host: cache3-db
# config/environments/production.rb
Rails.application.configure do
  config.solid_cache.connects_to = {
    shards: {
      cache_shard1: { writing: :cache_shard1 },
      cache_shard2: { writing: :cache_shard2 },
      cache_shard3: { writing: :cache_shard3 },
    }
  }

  config.cache_store = [ :solid_cache_store, cluster: { shards: [ :cache_shard1, :cache_shard2, :cache_shard3 ] } ]
end
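For readers unfamiliar with Maglev hashing, here is a minimal sketch of how such a lookup table can be built. It is not Solid Cache's implementation: `ToyMaglev`, the SHA-256 based hash function and the table size of 2053 are all assumptions chosen for illustration. Each node walks its own permutation of the table slots, and nodes take turns claiming their next free slot until the table is full.

```ruby
require "digest"

# Minimal Maglev lookup table (hypothetical class; hash functions and
# table size are illustrative choices, not those of Solid Cache).
class ToyMaglev
  TABLE_SIZE = 2053 # prime, so every node's permutation visits every slot

  attr_reader :table

  def initialize(nodes)
    @nodes = nodes.map(&:to_s).sort # sort so the table ignores listing order
    raise ArgumentError, "need at least one node" if @nodes.empty?
    @table = build_table
  end

  # Map a cache key to the node that owns its slot.
  def node_for(key)
    @table[hash_of("key", key) % TABLE_SIZE]
  end

  private

  def hash_of(seed, value)
    Digest::SHA256.hexdigest("#{seed}:#{value}")[0, 16].to_i(16)
  end

  def build_table
    offsets = @nodes.to_h { |n| [n, hash_of("offset", n) % TABLE_SIZE] }
    skips   = @nodes.to_h { |n| [n, hash_of("skip", n) % (TABLE_SIZE - 1) + 1] }
    cursor  = Hash.new(0) # how far each node has walked its preference list
    table   = Array.new(TABLE_SIZE)
    filled  = 0

    # Nodes take turns claiming their next preferred empty slot.
    loop do
      @nodes.each do |node|
        slot = nil
        loop do
          slot = (offsets[node] + cursor[node] * skips[node]) % TABLE_SIZE
          cursor[node] += 1
          break if table[slot].nil?
        end
        table[slot] = node
        filled += 1
        return table if filled == TABLE_SIZE
      end
    end
  end
end

ring = ToyMaglev.new(%i[cache_shard1 cache_shard2 cache_shard3])
puts ring.node_for("views/posts/1") # one of the three shard names, stable across runs
```

Because the nodes claim slots in turn, each ends up with an almost equal share of the table. The table depends only on the node names, which is why the node key matters when shards are renamed or reordered.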

Secondary cache clusters

You can add secondary cache clusters. Reads will only be sent to the primary cluster (i.e. the first one listed).

Writes will go to all clusters. Writes to the primary cluster are synchronous, but writes to the secondary clusters are asynchronous.

To specify multiple clusters you can do:

Rails.application.configure do
  config.solid_cache.connects_to = {
    shards: {
      cache_primary_shard1: { writing: :cache_primary_shard1 },
      cache_primary_shard2: { writing: :cache_primary_shard2 },
      cache_secondary_shard1: { writing: :cache_secondary_shard1 },
      cache_secondary_shard2: { writing: :cache_secondary_shard2 },
    }
  }

  primary_cluster = { shards: [ :cache_primary_shard1, :cache_primary_shard2 ] }
  secondary_cluster = { shards: [ :cache_secondary_shard1, :cache_secondary_shard2 ] }
  config.cache_store = [ :solid_cache_store, clusters: [ primary_cluster, secondary_cluster ] ]
end
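The read and write behaviour described at the start of this section can be modelled in a few lines. This is a toy sketch only: `ToyClusteredCache` is a hypothetical class, each "cluster" is a plain Hash, and a thread per write stands in for whatever asynchronous mechanism Solid Cache actually uses.

```ruby
# Toy model of primary/secondary clusters (not Solid Cache's code).
class ToyClusteredCache
  def initialize(primary, *secondaries)
    @primary = primary
    @secondaries = secondaries
    @pending = []
    @lock = Mutex.new
  end

  # Write to the primary synchronously and to secondaries in the background.
  def write(key, value)
    @primary[key] = value
    @secondaries.each do |cluster|
      @pending << Thread.new { @lock.synchronize { cluster[key] = value } }
    end
    value
  end

  # Reads only ever consult the primary cluster.
  def read(key)
    @primary[key]
  end

  # Wait for outstanding background writes (useful in tests).
  def flush
    @pending.each(&:join)
    @pending.clear
  end
end

primary = {}
secondary = {}
cache = ToyClusteredCache.new(primary, secondary)
cache.write("greeting", "hello")
cache.flush
puts secondary["greeting"] # hello
```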

Named shard destinations

By default, the node key used for sharding is the name of the database in database.yml.

It is possible to add names for the shards in the cluster config. This will allow you to shuffle or remove shards without breaking consistent hashing.

Rails.application.configure do
  config.solid_cache.connects_to = {
    shards: {
      cache_primary_shard1: { writing: :cache_primary_shard1 },
      cache_primary_shard2: { writing: :cache_primary_shard2 },
      cache_secondary_shard1: { writing: :cache_secondary_shard1 },
      cache_secondary_shard2: { writing: :cache_secondary_shard2 },
    }
  }

  primary_cluster = { shards: { cache_primary_shard1: :node1, cache_primary_shard2: :node2 } }
  secondary_cluster = { shards: { cache_secondary_shard1: :node3, cache_secondary_shard2: :node4 } }
  config.cache_store = [ :solid_cache_store, clusters: [ primary_cluster, secondary_cluster ] ]
end

Enabling encryption

Add this to an initializer:

ActiveSupport.on_load(:solid_cache_entry) do
  encrypts :value
end

Index size limits

The Solid Cache migrations try to create an index with 1024 byte entries. If that is too big for your database, you should:

  1. Edit the index size in the migration
  2. Set max_key_bytesize on your cache to the new value
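To show why a key size limit and an index size limit go together, here is one plausible way a store could keep over-long keys within a byte limit: keep a readable prefix and append a digest of the full key. This is an assumption for illustration only. `fit_key` is a hypothetical helper, and this README does not describe what Solid Cache itself does with keys longer than max_key_bytesize.

```ruby
require "digest"

# Length in bytes of the ":hash:" marker plus a SHA-256 hex digest.
KEY_SUFFIX_BYTES = ":hash:".bytesize + 64

# Return a key that fits within max_key_bytesize bytes.
# (Hypothetical helper; not part of Solid Cache's API.)
def fit_key(key, max_key_bytesize: 1024)
  return key if key.bytesize <= max_key_bytesize
  if max_key_bytesize < KEY_SUFFIX_BYTES
    raise ArgumentError, "limit too small to hold the digest suffix"
  end

  suffix = ":hash:#{Digest::SHA256.hexdigest(key)}"
  # byteslice may cut a multibyte character in half; scrub drops the remnant.
  prefix = key.byteslice(0, max_key_bytesize - suffix.bytesize).scrub("")
  prefix + suffix
end

long_key = "views/" + ("x" * 2000)
puts fit_key(long_key).bytesize # 1024
```

If the index in the migration is reduced to, say, 512 bytes, the same limit would be passed here so that no stored key can exceed what the index can hold.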

Development

Run the tests with bin/rails test. By default, these will run against SQLite.

You can also run the tests against MySQL and PostgreSQL. First start up the databases:

$ docker compose up -d

Next, set up the database schema:

$ TARGET_DB=mysql bin/rails db:setup
$ TARGET_DB=postgres bin/rails db:setup

Then run the tests for the target database:

$ TARGET_DB=mysql bin/rails test
$ TARGET_DB=postgres bin/rails test

Testing with multiple Rails versions

Solid Cache relies on appraisal to test multiple Rails versions.

To run a test for a specific version run:

$ bundle exec appraisal rails-7-1 bin/rails test

After updating the dependencies in the Gemfile, please run:

$ bundle
$ appraisal update

This ensures that the dependencies for all Rails versions are updated.

License

Solid Cache is licensed under MIT.
