Giter VIP home page Giter VIP logo

sql_mapper's Introduction

sql_mapper

An extension for ActiveRecord to improve read performance for large data sets at the sacrifice of some AR magic.

The basic use case for SQL Mapper is:

  1. You need read only data
  2. Your data is flat and can be contained within a single query.
  3. Normal use of ActiveRecord is not performant enough
  4. You are not dependent upon ActiveRecord magic to do the job after you have the data.

If all of the above is true, sql_mapper can help leverage raw sql to provide large performance gains over ActiveRecord's existing fetch capabilities while still coercing the results into objects for ease of use. How much more performant? My benchmarks show greater than an order of magnitude increase in performance when fetching 100,000 rows of data.

1.9.3p194 :036 >   prof 'Foo.all' do
1.9.3p194 :037 >       Foo.all
1.9.3p194 :038?>   end
Foo.all: 7492.341262

1.9.3p194 :043 >   prof 'SqlMapper' do
1.9.3p194 :044 >       ActiveRecord::SqlMapper.fetch :query => "select * from foos"
1.9.3p194 :045?>   end
SqlMapper: 474.20788899999997

Why is ActiveRecord slow in these cases? Read more here http://merbist.com/2012/02/23/quick-dive-into-ruby-orm-object-initialization/

Why raw sql? Its a data fetching DSL.

Examples

All of the examples assume we have a table named Foos defined as follows:

create table Foos (
  id serial,
  name string
);

Inline SQL

You can fetch results using raw inline SQL. By default, The results will be marshalled into structs with attributes matching the column names in your query.

foos = ActiveRecord::SqlMapper.fetch :query => "select * from foos"

This is analogous to, but more performant than, using ActiveRecord's all method.

foos = Foo.all

Accessing Column Values

As the result objects are structs, you can access column data through the object using column names and dot notation.

foos = ActiveRecord::SqlMapper.fetch :query => "select * from foos"
foos.each do |foo|
  puts "#{foo.id}: #{foo.name}"
end

Single Result Shortcut

A fetch_one shortcut exists to fetch a single result. All options and behavior apply to both fetch and fetch_one.

foo = ActiveRecord::SqlMapper.fetch_one :query => "select * from foos limit 1"

Named Queries

SQL queries can be mapped to logical names within a configuration block, allowing you to pass the logical name to the fetch method and keeping the sql compartmentalized within your app. In a rails application, this configuration should be put in an initializer.

ActiveRecord::SqlMapper.config do
  map :all_foos, "select * from foos"
end

foos = ActiveRecord::SqlMapper.fetch :query => :all_foos

Parameters

SQL queries can contain parameters using ? or :name placeholders that you are already familiar with from ActiveRecord. These can be used in both inline and named queries.

sql = "select * from foos where id = ?"
foo = ActiveRecord::SqlMapper.fetch_one :query => sql, :params => 1
foo = ActiveRecord::SqlMapper.fetch_one :query => sql, :params => [1]

sql = "select * from foos where id = :id"
foo = ActiveRecord::SqlMapper.fetch_one :query => sql, :params => {:id => 1}

Result Classes

By default, sql mapper results are structs, but you can also use hashes by specifying the result class to fetch or fetch_one.

sql = "select * from foos where id = ?"
foo = ActiveRecord::SqlMapper.fetch :query => sql,
                                     :params => 1,
                                     :result_class => Hash
puts foo[:id]
puts foo[:name]

You can also use any arbitrary class for results as long as it has an initializer that contains arguments for all columns in the same order. This can be useful if you want behavior attached to your results.

class Foo
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  def to_s
    "Foo(#{id}, #{name})"
  end
end

sql = "select * from foos where id = ?"
foo = ActiveRecord::SqlMapper.fetch_one :query => sql,
                                        :params => 1,
                                        :result_class => Foo
puts foo.to_s

Result classes can also be specified in the configuration at either a global or per-query level.

ActiveRecord::SqlMapper.config do
  result_class Hash
  map :all_foos, "select * from foos", Foo
end

More Examples

See examples of sql_mapper use at https://gist.github.com/4000974

Versions

Tests have been run and verified with ActiveRecord 3.2, 2.3.8, 2.3.5. Let me know if you have any problems with the gem using active record > 2.3.5.

FAQ

Q: Am I missing something? Doesn't Arel already do this?

A: Arel is an abstract syntax tree for generating SQL. Sql_mapper is a very simple extension to ActiveRecord that allows you to:

  1. Use native sql for the best possible tuned performance in read-only queries.
  2. Coerce the data into objects without having to do it yourself.
  3. Avoid the performance overhead involved with instantiating full-blown ActiveRecord::Base instances.

Q: How is this different from doing ActiveRecord::Base.connection.select_all() or whatever?

A: Its not very different, and there is probably no reason to refactor existing code using that method. There are, however, reasons to use sql_mapper instead of that approach if confronted with the issues it is designed to solve:

  1. select_all has different results depending on which version of active record you are using (array of arrays or array of hashes).
  2. person.name > person[0] for all sorts of reasons.
  3. person["name"] is better, but a departure from how you would access data via dot syntax with full blown active record objects. For instance, if you are tasked with "report A takes 10 minutes to generate! FIX IT!" and the code to generate report A is using standard active record objects, it will output each person's name with person.name. You would either have to refactor every place you access data to use the hash syntax, or coerce the data into an object for dot syntax. Sql_mapper does the coercion for you in a really efficient way and its DRY (really DRAAEHDA -- don't repeat anything anyone else has done already).
  4. Also, Named queries with sql_mapper allow you to have a logical name for the query that is probably easier to grok at first glance than a really complex query that is 100 lines long and mixed in with your ruby source.
ActiveRecord::SqlMapper.fetch :data_for_invoice_report

ActiveRecord::Base.select_all("""
SELECT DATEPART(m, Invoice.InvoiceDate) month, 
       DATEPART(yy, Invoice.InvoiceDate) year, 
       Reseller.Name, 
       SUM(jobstockitems_hardware.Price) sales_hardware,
       SUM(jobstockitems_consumables.Price) sales_consumables, 
FROM Invoice
INNER JOIN Reseller
ON Invoice.CustomerID = Reseller.ID
INNER JOIN Job
ON Invoice.ID = Job.InvoiceID
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price 
           FROM JobStockItems 
           INNER JOIN Stock 
           ON JobStockItems.StockID = Stock.StockID
           AND Stock.Category1 = 'Hardware'
           GROUP BY JobID) jobstockitems_hardware
ON Job.ID = jobstockitems_hardware.JobID
LEFT JOIN (SELECT JobID, SUM(PriceExTax) Price 
           FROM JobStockItems 
           INNER JOIN Stock 
           ON JobStockItems.StockID = Stock.StockID
           AND Stock.Category1 = 'Consumables'
           GROUP BY JobID) jobstockitems_consumables
ON Job.ID = jobstockitems_consumables.JobID
GROUP BY DATEPART(m, Invoice.Date), 
         DATEPART(yy, Invoice.Date), 
         Reseller.Name
ORDER BY DATEPART(yy, Invoice.Date) ASC, 
         DATEPART(m, Invoice.Date) ASC, 
         Reseller.Name ASC
""")

Contributing

Fork it, hack it, test it, then I'll pull it if I like it.

sql_mapper's People

Stargazers

 avatar Pablo Jairala avatar Lance Woodson avatar  avatar Pascal Jungblut avatar Adrien Coquio avatar Carlo Bertini avatar paul bonner avatar Lloyd Pick avatar  avatar Gerald Bauer avatar Dustin Schneider avatar

Watchers

Haifeng Cao avatar James Cloos avatar

sql_mapper's Issues

Date/Time converter implementation

  • Right now, any date/time columns in a result set will be returned as strings.
  • Need to provide a date/time converter implementation (see issue 1) to convert the string value into a Time object.
  • Time parsing in ruby is notoriously slow. Need to do some investigation into what the fastest possible conversion technique is.
  • Document the date/time converter in the readme.

Add _columns method to Struct results.

  • Currently, the struct results of a query will contain no behavior beyond having attributes matching column names.
  • It would be nice to be able to introspect a result struct for the columns/attributes it has.
    • reflective programming
    • tinkering in IRB.
  • Add a _columns method to the struct (see http://blog.rubybestpractices.com/posts/rklemme/017-Struct.html) that will return an array of column/attribute symbols in the struct.
  • Add an _attributes alias for _columns.

Create integer converter (pg/postgres)

  • The pg adapter for postgres returns integers as strings.
  • Add a converter (see issue #1) to convert the string to an integer.
  • Do we need to add for floats, too?
  • Document in readme.

Add ability to convert columns at fetch time

  • Add ability to map column name to lambda that receives the raw data for a column and converts it in some way before storing it in a result struct.
  • Should not be necessary by default.
  • Should be efficient, don't do if this slows things down. Performance before and after feature addition should not be measurably different.
  • Document in the readme.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.