The case for piggy backed attributes
Posted 06 Nov 2005
If you're subscribed to the Rails mailing list, you might have seen some people talking about piggy backing attributes onto SQL queries.
This is a feature of the Active Record ORM that I like very much, because it can be used to speed up your queries enormously. And I hope it won't be eliminated, ever.
Suppose there's a 1:n relationship between models A and B, i.e., for each A record there's exactly one B record, while a B record can have many A records (think B owns A). In my case A would be recipes and B would be users.
In ActiveRecord you'd have class declarations like this:
class User < ActiveRecord::Base
  has_many :recipes
end

class Recipe < ActiveRecord::Base
  belongs_to :user
end
Your usual code to retrieve some recipes with their associated authors will probably look like this:
def show_some
  @recipes = Recipe.find(:all, :limit => 50)
end
with the show_some template being rendered implicitly and looking similar to:
<% for r in @recipes %>
<%= r.title %>, <%= r.user.name %>
<br>
<% end %>
There's nothing wrong with this code, except that it's kind of slow:
There will be 51 queries to the database: one for the 50 recipes, plus one per recipe, because the owning user (author) is fetched in a separate query each time.
All these queries are constructed dynamically upon calling r.user.name.
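To make the count concrete, here is a minimal plain-Ruby sketch of the same access pattern. It uses no Rails and no database; FakeDB and its methods are hypothetical stand-ins that only count how often a "query" would be issued.

```ruby
# Plain-Ruby simulation of the query pattern; no Rails or database involved.
# FakeDB and its method names are hypothetical, for illustration only.
class FakeDB
  attr_reader :query_count

  def initialize(user_count, recipe_count)
    @users = (1..user_count).map { |id| { 'id' => id, 'name' => "user#{id}" } }
    @recipes = (1..recipe_count).map do |id|
      { 'id' => id, 'title' => "recipe#{id}", 'user_id' => (id % user_count) + 1 }
    end
    @query_count = 0
  end

  # Stands in for: SELECT * FROM recipes LIMIT n
  def recipes(limit)
    @query_count += 1
    @recipes.first(limit)
  end

  # Stands in for: SELECT * FROM users WHERE id = ?
  def user(id)
    @query_count += 1
    @users.find { |u| u['id'] == id }
  end
end

db = FakeDB.new(10, 50)

# Same shape as the template: one user lookup per recipe.
lines = db.recipes(50).map do |r|
  "#{r['title']}, #{db.user(r['user_id'])['name']}"
end

puts db.query_count
```

The counter ends up at 51: one query for the recipe list and one per rendered line, regardless of how many distinct users there actually are.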
We can do a little better than that: since approx. 0.13.1 Rails supports fetching associated objects by specifying our intention to access them via the :include syntax for find (called eager loading of associations). So we'd actually write our query like this:
def show_some
  @recipes = Recipe.find(:all, :limit => 50, :include => [:user])
end
But how much faster is this? The answer is: it depends, for example on how fast DB access is, whether your DB lives on another host, network speed, etc.
For a simple installation on one box, I have measured this (using railsbench, 1000 requests each, patched GC):
page request             total     stddev%   r/s     ms/r
/rezept/some_default     17.83765  0.2579    56.1    17.84
/rezept/some_include     13.25508  0.3499    75.4    13.26
Speedup is 34%, which is OK, but not spectacular.
We can do better than that using piggy backed attributes. First, we'll change the query to:
Recipe.find(:all, :limit => 50,
:conditions => "r.user_id=u.id",
:joins => 'r, users u',
:select => 'r.*, u.name AS user_name')
We still have only one query to the DB, this time involving a join, which could be slow. But don't worry, databases were constructed to perform such queries as fast as possible.
In addition to the recipe columns we now get 'user_name' included in the retrieved records. To make access to this attribute fast, we'll add a user_name function to our model:
class Recipe < ActiveRecord::Base
  belongs_to :user

  def user_name
    @attributes['user_name']
  end
end
and change our template to:
<% for r in @recipes %>
<%= r.title %>, <%= r.user_name %>
<br>
<% end %>
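The idea can be sketched without Rails. The following plain-Ruby example is only an illustration: PiggyRecipe and join_with_user_name are hypothetical names, and the in-memory join merely stands in for what the database does when it runs the joined query.

```ruby
# Plain-Ruby sketch, no Rails or database involved. PiggyRecipe and
# join_with_user_name are hypothetical names used only for illustration.
class PiggyRecipe
  def initialize(attributes)
    @attributes = attributes
  end

  def title
    @attributes['title']
  end

  # Reads the piggy backed column straight from the row; no user lookup.
  def user_name
    @attributes['user_name']
  end
end

# Stand-in for: SELECT r.*, u.name AS user_name
#               FROM recipes r, users u WHERE r.user_id = u.id
def join_with_user_name(recipe_rows, user_rows)
  names = user_rows.each_with_object({}) { |u, h| h[u['id']] = u['name'] }
  recipe_rows
    .select { |r| names.key?(r['user_id']) }
    .map { |r| r.merge('user_name' => names[r['user_id']]) }
end

user_rows = [
  { 'id' => 1, 'name' => 'alice' },
  { 'id' => 2, 'name' => 'bob' }
]
recipe_rows = [
  { 'id' => 10, 'title' => 'Pancakes', 'user_id' => 2 },
  { 'id' => 11, 'title' => 'Goulash', 'user_id' => 1 }
]

recipes = join_with_user_name(recipe_rows, user_rows).map { |row| PiggyRecipe.new(row) }
recipes.each { |r| puts "#{r.title}, #{r.user_name}" }
```

Each row already carries the author's name, so rendering a line never touches a user object. Note that, as with the SQL condition r.user_id=u.id, a recipe without a matching user drops out of the result.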
Is this faster? You bet!
page request             total     stddev%   r/s     ms/r
/rezept/some_default     17.83765  0.2579    56.1    17.84
/rezept/some_include     13.25508  0.3499    75.4    13.26
/rezept/some_piggy       4.19032   0.1059    238.6   4.19
It's a whopping 4 times faster than the first version, and 3 times faster than the :include version.
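For anyone who wants to check the arithmetic, the ratios follow directly from the "total" column of the table. This is just a small sketch; the hash keys are shorthand I made up for the three URLs.

```ruby
# Total seconds for 1000 requests, copied from the benchmark table.
# The hash keys are shorthand labels for the three URLs.
totals = {
  'some_default' => 17.83765,
  'some_include' => 13.25508,
  'some_piggy'   => 4.19032
}

# A ratio of total times is the factor by which throughput improves.
include_vs_default = totals['some_default'] / totals['some_include']
piggy_vs_default   = totals['some_default'] / totals['some_piggy']
piggy_vs_include   = totals['some_include'] / totals['some_piggy']

printf("include vs default: %.2fx\n", include_vs_default)
printf("piggy vs default:   %.2fx\n", piggy_vs_default)
printf("piggy vs include:   %.2fx\n", piggy_vs_include)
```

The first ratio is about 1.35, which is where the 34% figure for :include comes from; the other two come out a little above 4 and 3.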
I can almost hear you scream "But this code sucks! The original version was sooo much nicer. I hate adding this extra function to my model." My answer is: if your app is fast enough for you, don't do it. If it isn't, this is a nice opportunity for speedup.
Also, if your Rails app lives in a shared hosting environment, making it faster means being nice to everyone else.
And if you weren't looking for speed, you wouldn't be reading this blog in the first place ;-)