This site is dedicated to furthering knowledge about creating Ruby on Rails applications professionally. We discuss Ruby on Rails features from a performance angle, discuss Ruby on Rails performance analysis methods, provide information on Ruby on Rails scaling and benchmark Ruby on Rails performance for each release. We discuss best practices for selecting Ruby on Rails session containers, fragment and page caching and optimizing database queries.

The case for piggy backed attributes

Posted 06 Nov 2005

If you're subscribed to the Rails mailing list, you might have seen some people talking about piggy backing attributes onto SQL queries.

This is a feature of the Active Record ORM that I like very much, because it can be used to speed up your queries enormously. And I hope it won't be eliminated, ever.

Suppose there's an n:1 relationship between models A and B, i.e., each A record belongs to exactly one B record, while a B record can have many A records (think B owns A). In my case A would be recipes and B would be users.

In ActiveRecord you'd have class declarations like this:

class User < ActiveRecord::Base
  has_many :recipes
end
class Recipe < ActiveRecord::Base
  belongs_to :user
end

Your usual code to retrieve some recipes with their associated authors will probably look like this:

def show_some
  @recipes = Recipe.find(:all, :limit => 50)
end

with the show_some template being rendered implicitly and looking similar to:

<% for r in @recipes %>
  <%= r.title %>, <%= r.user.name %>
  <br>
<% end %>

There's nothing wrong with this code, except that it's kind of slow:

  • there will be 51 queries to the database, because for each recipe the owning user (author) will be fetched in a separate query

  • all these queries are constructed dynamically upon calling r.user.name
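To make the count of 51 concrete, here is a minimal plain-Ruby sketch of the same access pattern. It does not use Rails: FakeDB and its methods are hypothetical stand-ins that only count how often the "database" is asked for something, assuming each r.user call triggers its own lookup.

```ruby
# Toy in-memory "database" that counts the queries it receives.
# FakeDB is a hypothetical name for illustration, not part of Rails.
class FakeDB
  attr_reader :query_count

  def initialize(users, recipes)
    @users = users      # { id => name }
    @recipes = recipes  # [{ id:, title:, user_id: }, ...]
    @query_count = 0
  end

  # Stands in for: SELECT * FROM recipes LIMIT n
  def recipes(limit)
    @query_count += 1
    @recipes.first(limit)
  end

  # Stands in for: SELECT * FROM users WHERE id = ?
  def user(id)
    @query_count += 1
    { id: id, name: @users.fetch(id) }
  end
end

users = { 1 => "alice", 2 => "bob" }
recipes = (1..50).map { |i| { id: i, title: "Recipe #{i}", user_id: 1 + i % 2 } }
db = FakeDB.new(users, recipes)

# Same pattern as the template: one user lookup per recipe.
lines = db.recipes(50).map do |r|
  "#{r[:title]}, #{db.user(r[:user_id])[:name]}"
end

puts db.query_count  # 1 query for the recipes plus 50 for the users
```

Note that the sketch does not cache users, so two recipes by the same author still cost two lookups, which matches the per-recipe fetching described above.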

We can do a little better than that: since approx. 0.13.1 Rails supports fetching associated objects by specifying our intention to access them via the :include syntax for find (called eager loading of associations). So we'd actually write our query like this:

def show_some
  @recipes = Recipe.find(:all, :limit => 50, :include => [:user])
end
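For comparison, the eager-loading idea can be sketched in plain Ruby as well. This is not how Active Record implements :include; it only assumes, as this post implies, that recipes and users arrive together in a single query. The lambda fetch_recipes_with_users and the in-memory tables are hypothetical.

```ruby
# Hypothetical in-memory tables standing in for the users and recipes tables.
users = { 1 => { id: 1, name: "alice" }, 2 => { id: 2, name: "bob" } }
recipes_table = (1..50).map { |i| { id: i, title: "Recipe #{i}", user_id: 1 + i % 2 } }

query_count = 0

# Stands in for one joined query returning each recipe together with its user.
fetch_recipes_with_users = lambda do |limit|
  query_count += 1
  recipes_table.first(limit).map { |r| r.merge(user: users.fetch(r[:user_id])) }
end

recipes = fetch_recipes_with_users.call(50)

# Rendering touches r[:user] but issues no further queries.
lines = recipes.map { |r| "#{r[:title]}, #{r[:user][:name]}" }

puts query_count
```

The whole user record is still attached to every recipe here, even though the template only needs the name.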

But how much faster is this? The answer is: it depends, for example on how fast DB access is, whether your DB lives on another host, and how fast the network is.

For a simple installation on one box, I have measured this (using railsbench, 1000 requests each, patched GC):

page request               total  stddev%     r/s    ms/r
/rezept/some_default    17.83765   0.2579    56.1   17.84
/rezept/some_include    13.25508   0.3499    75.4   13.26

Speedup is 34%, which is OK, but not spectacular.

We can do better than that using piggy backed attributes. First, we'll change the query to:

Recipe.find(:all, :limit => 50,
            :conditions => "r.user_id=u.id",
            :joins => 'r, users u',
            :select => 'r.*, u.name AS user_name')

We still have only one query to the DB, this time involving a join, which could be slow. But don't worry, databases were constructed to perform such queries as fast as possible.

In addition to the recipe columns we now get 'user_name' included in the retrieved records. To make access to this attribute fast, we'll add a user_name method to our model:

class Recipe < ActiveRecord::Base
  belongs_to :user
  def user_name
    @attributes['user_name']
  end
end

and change our template to:

<% for r in @recipes %>
  <%= r.title %>, <%= r.user_name %>
  <br>
<% end %>
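The mechanics can be tried outside Rails with a small sketch. The Recipe class below is a hypothetical stand-in for the Active Record model: it only wraps the attribute hash built from one result row, and the join is emulated in memory, so nothing here touches a real database.

```ruby
# Hypothetical stand-in for the model: wraps one row of the result set.
class Recipe
  def initialize(attributes)
    @attributes = attributes
  end

  def title
    @attributes['title']
  end

  # Piggy-backed column: read straight from the row, no second query.
  def user_name
    @attributes['user_name']
  end
end

users = { 1 => { 'name' => 'alice' }, 2 => { 'name' => 'bob' } }
recipe_rows = [
  { 'id' => 1, 'title' => 'Pancakes', 'user_id' => 2 },
  { 'id' => 2, 'title' => 'Goulash',  'user_id' => 1 }
]

# Emulates: SELECT r.*, u.name AS user_name
#           FROM recipes r, users u WHERE r.user_id = u.id
joined_rows = recipe_rows.map do |row|
  row.merge('user_name' => users.fetch(row['user_id'])['name'])
end

recipes = joined_rows.map { |row| Recipe.new(row) }
recipes.each { |r| puts "#{r.title}, #{r.user_name}" }
```

Each Recipe carries the author's name as an ordinary attribute, so no user objects are built at all.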

Is this faster? You bet!

page request               total  stddev%     r/s    ms/r
/rezept/some_default    17.83765   0.2579    56.1   17.84
/rezept/some_include    13.25508   0.3499    75.4   13.26
/rezept/some_piggy       4.19032   0.1059   238.6    4.19

It's a whopping 4 times faster than the first version. And 3 times faster than the :include version.
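The ratios quoted in this post follow directly from the r/s column of the table. A short sketch of the arithmetic, with the numbers copied from the measurements and speedup being a hypothetical helper name:

```ruby
# Requests per second, copied from the benchmark table.
rps = {
  'some_default' => 56.1,
  'some_include' => 75.4,
  'some_piggy'   => 238.6
}

# Throughput ratio of one variant over another, rounded to two decimals.
def speedup(rps, faster, slower)
  (rps.fetch(faster) / rps.fetch(slower)).round(2)
end

include_vs_default = speedup(rps, 'some_include', 'some_default')
piggy_vs_default   = speedup(rps, 'some_piggy', 'some_default')
piggy_vs_include   = speedup(rps, 'some_piggy', 'some_include')

puts include_vs_default  # 1.34, i.e. the 34% speedup
puts piggy_vs_default    # 4.25
puts piggy_vs_include    # 3.16
```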

I can almost hear you scream "But this code sucks! The original version was sooo much nicer. I hate adding this extra function to my model." My answer is: if your app is fast enough for you, don't do it. If it isn't, this is a nice opportunity for speedup.

Also, if your Rails app lives in a shared hosting environment, making it faster means being nice to everyone else on the box.

And if you weren't looking for speed, you wouldn't be reading this blog in the first place ;-)

Posted in performance | Tags piggyback
