Giter VIP home page Giter VIP logo

wt_activerecord_index_spy's Introduction

wt_activerecord_index_spy

Gem GitHub Actions Workflow Hippocratic License

A Ruby library to watch and analyze queries that run using ActiveRecord to check if they use a proper index.

It subscribes to sql.active_record notification using ActiveSupport::Notifications.

It was designed to be used in tests, but it's also possible to use it in staging or production, carefully.

Why would I use this?

Imagine you have an application running in production and after a deploy, it starts to slow down.

After a perhaps exhaustive debugging session, you may find that a new query or perhaps a change in the database schema was responsible for starting to have queries that do not use database indexes.

Then, you create the appropriate index and the problem is solved!

By using this gem, you can get those queries that are not using suitable database indexes in your test suite. So you won't have surprises like the example above, after a deploy.

You can also enable the gem in your development/staging environment, generate a report and analyze if there is any missing index.

Installation

Add this line to your application's Gemfile:

gem 'wt_activerecord_index_spy'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install wt_activerecord_index_spy

Usage

There are 3 different modes to use it:

1 - Using a test matcher

Include the helper in your RSpec configuration:

require 'wt_activerecord_index_spy/test_helpers'

RSpec.configure do |config|
  config.include(WtActiverecordIndexSpy::TestHelpers)
end

Use the helper have_used_db_indexes where you want to check if all queries used database indexes:

it 'uses an index for all the queries' do
  expect { SomeClass.some_method }.to have_used_db_indexes
end

Given that some results are uncertain, it's also possible to set the matcher to fail only with certain results:

it 'uses an index for all the queries' do
  expect { SomeClass.some_method }.to have_used_db_indexes(only_certains: true)
end

Run for all rspec tests

By adding the following to your rspec configuration the have_used_db_indexes will run on each individual test and error if an index has not been used:

Rspec.configure do |config|
  config.around(:each) do |example|
    unless example.metadata[:skip_index_spy]
      expect { example.run }.to(have_used_db_indexes)
    else
      example.run
    end
  end

  config.after(:all) do
    WtActiverecordIndexSpy.export_html_results
  end
end

If you wish to skip index checking for specific tests you can then annotate your test as follows:

describe 'Will not check indexes', :skip_index_spy do
# ...or...
context 'Does not check indexes', :skip_index_spy do
# ...or...
it 'will not check indexes', :skip_index_spy do

2 - Watching all queries from a start point

Add this line to enable it:

WtActiverecordIndexSpy.watch_queries

After that, wt_activerecord_index_spy will run an EXPLAIN query for every query fired with ActiveRecord that has a WHERE condition.

Finally, you can generate a report with the results:

WtActiverecordIndexSpy.export_html_results

This method creates an HTML file with a report and prints its location to STDOUT.

The content of this file is similar to this:

Level Identifier Query Origin
certain User Load SELECT users.* FROM users WHERE users.name = 'lala' LIMIT 1 spec/wt_activerecord_index_spy_spec.rb:162
uncertain User Load SELECT users.* FROM users WHERE users.city_id IN (SELECT cities.id FROM cities WHERE cities.name = 'Santo Andre') spec/wt_activerecord_index_spy_spec.rb:173

Where:

  • Level: certain when it is certain that an index is missing, or uncertain when it's not possible to be sure
  • Identifier: is the query identifier reported ActiveRecord notification
  • Origin: is the line the query was fired

This mode, by default, ignores queries that were originated in test code. For that, it considers files which path includes test/ or spec/.

It's possible to disable it as follows:

WtActiverecordIndexSpy.watch_queries(ignore_queries_originated_in_test_code: false)

If the same query runs in many places, only one origin will be added to the report.

It's also possible to get the results using the following methods:

# Returns a list of certain results
WtActiverecordIndexSpy.certain_results

# Returns a list of certain and uncertain results mixed
WtActiverecordIndexSpy.results

3 - Watching all queries given a block

It's also possible to enable it in a specific context, using a block:

WtActiverecordIndexSpy.watch_queries do
  # some code...
end

After that, you can generate the HTML report with:

WtActiverecordIndexSpy.export_html_results

Supported versions

Currently, it supports:

Ruby: 3.0, 2.7, 2.6, 2.5

Mysql: 5.7

PostgreSQL: 13.2

Note: Currently, the PostgreSQL query analyser is not so intelligent and can't be certain if an index is missing or not. So all results are uncertain. More details in #12

ActiveRecord: 6, 5, 4

RSpec: 3.x

Development

After checking out the repo, run cp .env.template .env and set database credentials to run the tests.

Run bin/setup to install dependencies.

Then, run bundle exec rake db:create db:migrate to create the Mysql databases and ADAPTER=postgresql bundle exec rake db:create db:migrate to create the PostgresSQL database.

Then, run bundle exec rspec to run the tests for Mysql and ADAPTER=postgresql bundle exec rspec.

You can also run bundle exec bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/wetransfer/wt_activerecord_index_spy. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct. Please add your name to the CONTRIBUTORS.md

License

The gem is available as open source under the terms of the Hippocratic License.

Code of Conduct

Everyone interacting in the WT Activerecord Index Spy project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

wt_activerecord_index_spy's People

Contributors

fabioperrella avatar lloydwatkin avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

wt_activerecord_index_spy's Issues

Error generating report

Describe the bug

ArgumentError:
  comparison of Array with Array failed
# (erb):12:in `sort_by'
# (erb):12:in `<main>'
# ./spec/rails_helper.rb:93:in `block (2 levels) in <top (required)>'

To Reproduce
Unsure without sharing code at this point, here is the line I use to generate the report:

WtActiverecordIndexSpy.export_html_results(File.new(File.join(Rails.root, "tmp/wt_activerecord_index_spy-results.html"), "w"))

Issue is here https://github.com/WeTransfer/wt_activerecord_index_spy/blob/main/lib/wt_activerecord_index_spy/results.html.erb#L12

Expected behavior
Report generated

Screenshots

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Database/version: MySql 5.7
  • Ruby version: Ruby 2.7.2
  • Activerecord version: Rails 5.2.4.6

It can't break application tests

I'm testing this gem in Storm and it has broken a few tests there.

The name of the branch is mysql-index-checker (this is the former name of this gem)

Improve PostgreSQL query analyser

It's very hard to figure out if a table needs a new index looking at the result of a PostgreSQL EXPLAIN query.

PostgreSQL's EXPLAIN result is not deterministic and varies depending on some internal state in the database statistics.

For example, I run a simple explain in a table which has a proper index, in 2 different moments, and got 2 different results:

[1] pry(main)> Feature.where(plan_id: 312312).explain
  Feature Load (4.0ms)  SELECT "features".* FROM "features" WHERE "features"."plan_id" = $1  [["plan_id", 312312]]
=> EXPLAIN for: SELECT "features".* FROM "features" WHERE "features"."plan_id" = $1 [["plan_id", 312312]]
                       QUERY PLAN
---------------------------------------------------------
 Seq Scan on features  (cost=0.00..1.06 rows=1 width=72)
   Filter: (plan_id = 312312)
(2 rows)

[2] pry(main)> Feature.count
   (2.8ms)  SELECT COUNT(*) FROM "features"
=> 5
[3] pry(main)> Plan.count
   (2.7ms)  SELECT COUNT(*) FROM "plans"
=> 2

###################################################################################################################

[1] pry(main)> Feature.where(plan_id: 312312).explain
  Feature Load (2.3ms)  SELECT "features".* FROM "features" WHERE "features"."plan_id" = $1  [["plan_id", 312312]]
=> EXPLAIN for: SELECT "features".* FROM "features" WHERE "features"."plan_id" = $1 [["plan_id", 312312]]
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Bitmap Heap Scan on features  (cost=4.18..12.64 rows=4 width=72)
   Recheck Cond: (plan_id = 312312)
   ->  Bitmap Index Scan on index_features_on_plan_id  (cost=0.00..4.18 rows=4 width=0)
         Index Cond: (plan_id = 312312)

So, I don't know how to be certain if an index is missing or not.

Maybe an alternative would be parsing the query to extract the filter conditions, describe the table to list the current indexes and compare if they match, but I would do it only as the last alternative because:

  • it's hard to parse a query and extract the filters
  • it would implement another kind of explain query

Explain queries may work better with larger datasets

Could it be an idea to try to install a "representative dataset" for testing the gem? For example using a fuller list of Brazilian municipalities + a longer list of people names, say generated with Faker. I suspect it is really dependent on the table size, both for MySQL and PostgerSQL, what the query planner will plan.

Indeed, I also think that the results would be better if using more data, but this will not be the case when it analyses queries running on unit tests, for example.

I will create an issue with that, maybe an idea can be born influenced by that!

Originally posted by @fabioperrella in #13 (comment)

Should ignore queries executed in test methods

It should ignore queries executed in test methods, for example:

# somefile_spec.rb
it 'does something'
  SomeService.run # <--- the queries which run inside of this method should be analysed

  result = Model.find_by(arg1: 'lala', arg2: 'popo') # <--- this query should not be analysed

  expect(result).to be_present
end

Write something in the README

Currently, the README is the default one that was created by Bundler.

We need to write instructions on how to use this gem

Ability to turn off for specific tests

Is your feature request related to a problem? Please describe.
When running rspec tests it would be great if I could annotate a test/suite/etc so it doesn't do any checking whilst the test is running.

Describe the solution you'd like
For example:

it 'does not need an index because the table has two rows', :skip_index_spy do
  expect(Model.active.count).to eq 2
end

Describe alternatives you've considered
I've been reworking my code up to this point as much as possible to get around the problem.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.