Rails 1.1 Performance Report (vs. 1.0)

Short Summary for the Impatient

We have worked hard on making Rails 1.1 perform at the same level as Rails 1.0. This goal has been achieved, and in some cases we have improved performance significantly.

The following performance data table shows the speed difference for the fastest available configuration for the tested application.

page                            c1 total  c2 total  c1 r/s  c2 r/s  c1 ms/r  c2 ms/r  c1/c2
/empty/index                     7.86139   7.10545   636.0   703.7     1.57     1.42   1.11
/welcome/index                   8.38743   8.32744   596.1   600.4     1.68     1.67   1.01
/rezept/index                    8.87038   8.75717   563.7   571.0     1.77     1.75   1.01
/rezept/myknzlpzl                8.86172   8.76325   564.2   570.6     1.77     1.75   1.01
/rezept/show/713                22.22530  20.12046   225.0   248.5     4.45     4.02   1.10
/rezept/cat/Hauptspeise         25.29051  24.70123   197.7   202.4     5.06     4.94   1.02
/rezept/cat/Hauptspeise?page=5  25.92528  25.40904   192.9   196.8     5.19     5.08   1.02
/rezept/letter/G                25.06242  24.96315   199.5   200.3     5.01     4.99   1.00
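
The derived columns in the table above follow from the total times by simple arithmetic. The sketch below shows one way to reproduce them. It assumes 5000 requests per measurement, a figure that is not stated in the report but is inferred from the table itself (for example, 636.0 r/s times 7.86139 s is roughly 5000). The method names are hypothetical and are not part of railsbench.

```ruby
# Assumption: 5000 requests per measurement, inferred from the table values.
REQUESTS = 5000

# Derive requests per second and milliseconds per request from a total time.
def metrics(total_seconds, requests = REQUESTS)
  {
    :requests_per_second => (requests / total_seconds).round(1),
    :ms_per_request      => (total_seconds * 1000.0 / requests).round(2)
  }
end

# Ratio of the two total times; values above 1.0 mean c2 was faster than c1.
def ratio(c1_total, c2_total)
  (c1_total / c2_total).round(2)
end

# Recompute the /empty/index row.
puts metrics(7.86139).inspect
puts metrics(7.10545).inspect
puts ratio(7.86139, 7.10545)
```

Under that assumption, the recomputed values agree with the /empty/index row to the precision shown in the table.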

Environment

Each test was run using railsbench on an Athlon64 3000+ with 1 GB of memory, running SuSE 9.3, Ruby 1.8.2 and MySQL 4.1, with RAILS_PERF_RUNS=5. Sessions were stored in MySQL using mysql-ruby-2.7, and Rails logging was disabled.

Configuration options

A number of configuration options were tested:

out of the box
    This is, of course, the most natural thing to test.
gc100
    Manual garbage collection control: garbage collection was disabled and a collection was forced after every 100 requests.
patched_gc
    Test with the garbage collector patch applied to Ruby 1.8.2, with the following settings:
        RUBY_GC_STATS=0
        RUBY_HEAP_MIN_SLOTS=600000
        RUBY_GC_MALLOC_LIMIT=59000000
        RUBY_HEAP_FREE_MIN=100000
mysql_session
    A MySQL native session class implementation.
nat_routes
    A hand-coded routing implementation using only URLs of the form controller/action/id.
links
    Use link_to calls instead of specifying URLs directly.
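
To make the gc100 option more concrete, here is a minimal sketch of manual garbage collection control using only Ruby's built-in GC.disable, GC.enable and GC.start. The class name RequestCountingGC and the method request_finished are hypothetical. This is not the railsbench implementation, only an illustration of the idea of collecting once every 100 requests.

```ruby
# Hypothetical helper illustrating the gc100 idea; not taken from railsbench.
class RequestCountingGC
  attr_reader :collections

  def initialize(interval = 100)
    @interval    = interval
    @requests    = 0
    @collections = 0
    GC.disable # no automatic collections between requests
  end

  # Call once after each request; forces a collection every @interval requests.
  # Returns true if a collection was forced, false otherwise.
  def request_finished
    @requests += 1
    return false unless (@requests % @interval).zero?

    GC.enable
    GC.start   # force a full collection now
    GC.disable
    @collections += 1
    true
  end
end

# Simulate 250 requests: collections are forced after request 100 and 200.
gc_control = RequestCountingGC.new(100)
250.times { gc_control.request_finished }
puts gc_control.collections
GC.enable # restore normal behaviour after the simulation
```

The point of such a scheme is that collections happen at predictable moments between requests rather than at arbitrary points inside them, at the cost of higher memory use between collections.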

Test Data

For the tests, I selected a number of pages from my recipe database application:

/empty/index                    a simple render_text
/welcome/index                  a welcome page, action cached
/rezept/index                   application front page, user dependent, action cached
/rezept/myknzlpzl               my recipes, user dependent, action cached
/rezept/show/713                show recipe 713
/rezept/cat/Hauptspeise         show all recipes of category Hauptspeise, paginated
/rezept/cat/Hauptspeise?page=5  page 5 of category Hauptspeise
/rezept/letter/G                all recipes with a title starting with G
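
All of the test URLs above have the controller/action/id shape that the nat_routes option relies on. As an illustration of what a hand-coded recognizer for that shape might look like, here is a minimal sketch. The method name recognize and the default action "index" are assumptions made for this example. It is not the routing code used in the benchmarks.

```ruby
# Hypothetical recognizer for URLs of the form /controller/action/id.
# Not the nat_routes implementation measured in this report.
def recognize(path)
  path_only = path[/\A[^?]*/]                       # drop any query string
  controller, action, id = path_only.sub(%r{\A/}, "").split("/", 3)
  { :controller => controller, :action => action || "index", :id => id }
end

puts recognize("/rezept/show/713").inspect
puts recognize("/rezept/cat/Hauptspeise?page=5").inspect
```

A fixed three-part split like this avoids evaluating a general set of route patterns on every request, which is presumably where the speedup of a hand-coded approach comes from.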

Performance data for each configuration can be found here.

One-on-one comparisons produced by the perf_comp script for each configuration are listed in this file.

Logs of the test runs:

If you're interested in profiler output for some of the benchmarks, study call trees and call graphs here.

Performance Charts

Out of the box 1.1 vs. 1.0

This one shows how well 1.1 does out of the box against 1.0, with no options, gc100 and patched GC. The numbers are requests per second.

Out of the box 1.1 vs. fully optimized 1.1

This one shows the speedup of all tuning options applied to 1.1, for the different GC options.

Tuning effects on 1.1 with patched GC

This one shows the speedup obtained by applying different tuning options to 1.1, all running with patched GC.

Optimized 1.0 vs. optimized 1.1

This one shows how well 1.1 does against 1.0 with all tuning options applied, for each of the GC variants: no options, gc100 and patched GC.

Footnote

If you study the charts carefully and look at the performance numbers for the default Ruby garbage collector, you will discover that 1.1 performance is sometimes worse than 1.0 performance and sometimes better. Contrast this with the numbers for 1.1 obtained with the patched garbage collector, which are always equal to or better than the 1.0 numbers. This clearly shows that, in order to obtain reliable performance data, the patched GC should be used. The reasons are explained here. Note: I will update the patch with a newer version in a few days, so refrain from downloading the patch right now.