AppNeta releases their TraceView Ruby instrumentation code as open source

We just opened the GitHub repository to the public and announced it here. The TraceView Ruby instrumentation gives you performance data like this, this, this, this, and even this.

If any of you are using it, or give it a run and hit any issues or questions, feel free to ping me anytime at pglombardo at or via a support ticket through AppNeta. I’d love to get as much feedback as possible. AppNeta has a free-forever project-level plan that always shows the last 24 hours of performance data.

Some links:

Why is my site slow?

The War on ActionView with Russian Doll Caching

Rails 4 is out, featuring Russian Doll caching (AKA cache digests). In this article, I apply Russian Doll caching to one of my poorly performing Rails 3 pages using the cache_digests gem.

This article was originally featured on AppNeta’s Application Performance Blog. Check it out!

ActionView templates are great. They are easy to code, manage, and extend, but the one thing they are not is fast…at least not out of the box.

In this article, I’ll be using AppNeta’s TraceView to time ActionView performance. If you haven’t used TraceView before, check out my previous article Instrumenting Ruby on Rails with TraceView.

ActionView is Slow; Pitfalls Ahead

ActionView puts forth a great development pattern of layouts, views, and partials that is easy to understand, implement, and maintain, but that comes at a cost: the rendering process is complex and slow.

[Screenshot: ActionView timings for Users#show (full-size image)]

The screenshot above shows the timings for the Users#show URL on Gameface. The page in question is fairly straightforward, containing four components: a topbar, a sidebar, user details, and a listing of game characters.

With no caching at all, the ActionView layer averages roughly 365ms for this URL. That represents over 80% of average request processing time and dwarfs all of the other layers combined.

[Screenshot: per-layer timing breakdown with no caching (full-size image)]

In terms of performance, ActionView is a weapon of mass destruction and is the low-hanging fruit for improvement.

Russian Doll Caching

Russian Doll caching is a type of nested fragment caching that automatically expires fragments when object timestamps change.

You still make the same calls as previous fragment caching schemes in Rails:

- cache @user do
  (user view data)

  - cache [ 'details', @user ] do
    (user details view data)

  - cache [ 'characters', @user ] do
    - @user.characters.each do |character|

      - cache character do
        (character view data)

With Russian Doll caching (AKA cache digests), unique cache keys are formed from the cached object’s id and updated_at timestamp plus an MD5 digest of the view template:

views/users/3-20130530135425/7a1bb8bb15b02ee7aa69cec1d5f6f630
views/details/users/3-20130530135425/6f28ec6d31e7e3b73a575777d59e63ca
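As a rough sketch of how such a key is assembled (the template source below is an illustrative stand-in; the real digest input is the compiled template tree, not exactly this string):

```ruby
require 'digest/md5'

# model.cache_key takes the form "model_name/id-updated_at_timestamp"
user_cache_key = "users/3-20130530135425"

# cache_digests appends an MD5 digest derived from the view template,
# so editing the template also busts its fragments.
template_source = "- cache @user do\n  (user view data)\n"
fragment_key = "views/#{user_cache_key}/#{Digest::MD5.hexdigest(template_source)}"
puts fragment_key
```

Touching the user changes the `3-20130530135425` portion; editing the template changes the digest portion. Either way, the old fragment simply stops being read.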

The advantage is that when an object is updated, the outer fragments are automatically invalidated while untouched nested fragments can be re-used, just like nesting Russian dolls.

A key requirement is that child objects update the timestamps of their parent object via ActiveRecord’s touch option. This allows for automatic cache invalidation and avoids serving stale content.

class Character < ActiveRecord::Base
    belongs_to :user, touch: true
end
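To make the invalidation chain concrete, here is a plain-Ruby sketch of what touch: true does conceptually (no ActiveRecord here; these User and Character classes are simplified stand-ins):

```ruby
class User
  attr_accessor :updated_at
end

class Character
  attr_reader :user

  def initialize(user)
    @user = user
  end

  # With `belongs_to :user, touch: true`, saving a character also
  # bumps the parent's updated_at, which changes the parent's cache
  # key and so invalidates the outer fragment.
  def save(now)
    user.updated_at = now
  end
end

user = User.new
user.updated_at = "20130530135425"

Character.new(user).save("20130601100000")
puts user.updated_at  # parent timestamp changed -> outer cache invalidated
```

In real ActiveRecord, the touch happens inside the child’s save transaction, so the parent’s cache key changes atomically with the child update.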

For a more thorough explanation of Russian Doll caching, see Ryan Bates’ cache digests episode or this explanation from Remarkable Labs.

Cache Friendly Page Layouts

When caching, it’s best to avoid caching logged-in content, since the same cached fragments get served to all users regardless of login status.

For effective (and less problematic) fragment caching, it’s critical to design the page to separate out logged-in content.

Below is the layout for the Gameface profile view before the redesign. Logged-in-specific content and links are sprinkled throughout the page, making it hard to divide into cache-able fragments.

[Screenshot: profile page layout before the redesign (full-size image)]

Properly caching this page as-is would be complicated and inefficient since we would have to tip-toe around logged-in content.

The Redesign

To fix this, I reorganized the page to group the logged-in-specific content into one area. With logged-in content out of the way, we are free to cache the rest of the page in well-defined fragments.

[Screenshot: profile page layout after the redesign (full-size image)]

The Results

With the page redesign and Russian Doll fragment caching applied to the large majority of the page, we now average a much better ~120ms for ActionView on that URL: a reduction of 245ms, or 67%, in average processing time.

[Screenshot: ActionView timings after caching (full-size image)]

On top of the performance improvement, we also get automatic cache invalidation as object timestamps are updated. This greatly simplifies the whole system by not requiring cache sweepers or other tricks to invalidate stale caches.


Russian Doll caching is a small but significant improvement over prior caching in Rails. When used effectively, it can greatly reduce server side ActionView processing and automatically expire stale fragments.

We took a previously uncached URL with a poor page layout that was averaging ~365ms of processing in ActionView and reduced that to ~120ms, a 67% performance improvement.

Additional Considerations

When using fragment caching, note which backing cache store you are using. The default cache store in Rails is the filesystem, but you can get even greater performance by backing your cache with Memcached or Redis instead.

See the redis-rails gem as an example.
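As a sketch, switching the backing store is a one-line change in your environment config (the host, port, and database number below are illustrative placeholders for your own setup):

```ruby
# config/environments/production.rb

# Default filesystem-backed store:
# config.cache_store = :file_store, "tmp/cache"

# Redis-backed store via the redis-rails gem:
config.cache_store = :redis_store, 'redis://localhost:6379/0/cache'
```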

Next Up

Coming soon, I’ll revisit this page under Rails 4 to further explore performance improvements over Rails 3.


Takeaways

Separate out logged-in content from login-agnostic content for the best cache coverage. Page design affects caching efficiency.

When possible, always call cache with the object being cached. Cache keys, and their eventual invalidation, are keyed off the timestamp of the object being cached.

- cache @user do
  (user view data)

  - cache [ 'details', @user ] do
    (user details view data)

To cache an index of objects, you have to fall back to manually passing the expires_in option, since there is no single object timestamp to key off of.

- cache 'user list', expires_in: 30.minutes do
  (user list view data)

Update belongs_to model relationships with touch: true so that parent fragment caches are invalidated when children are updated.

Collect timing data before and after to quantify and validate changes.

Instrumenting Ruby on Rails with TraceView in under 10 minutes

Update May 14, 2013: AppNeta now offers free tracing for single projects. Check it out on their pricing page.

TraceView by AppNeta provides deep performance monitoring of web applications.

It gives you insight into your web application performance such as this:

[Screenshot: application performance overview]

and a per-request drill-down that shows you the nitty-gritty detail of where time is spent in individual requests (full-size):

[Screenshot: per-request drill-down]

and even end-user monitoring:

[Screenshot: end-user monitoring]

I run it on Gameface and PasswordPusher - it’s an essential tool in identifying problem areas, performance bottlenecks and simply poor performing code. (Read: ActionView)

Disclaimer: I authored the Ruby instrumentation for TraceView, so I may be a bit biased…but with good reason!


Installing TraceView consists of two parts: 1) installing the system daemon on your host and 2) installing the Ruby gem in your application.

Why a system daemon? TraceView uses a system daemon to collect instrumentation from sources beyond application code such as host metrics, Apache or Resque.

The system daemon is installed with two commands that can be pasted into your shell. An account-specific version of these commands is available in your TraceView dashboard once you create an account (under Settings; App Configuration; Trace New Hosts).

And the gem, available on RubyGems, goes in your application’s Gemfile:

gem 'oboe'


TraceView functions by sampling a subset of all requests that go through your web application. This sample rate must be set at the entry point of requests into your application, which can be a load balancer, Apache/nginx, or Ruby itself. Successive hosts and software stacks that requests pass through will act according to the upstream settings.

Yes, that means requests can be traced across hosts and software stacks, and even across internal API calls made via HTTP/SOAP/REST, which makes for spectacular application insight. But that’s another post for another time.

For this walkthrough, we’re going to assume that you’re running a Ruby on Rails application with Unicorn, Thin, Puma, or any other Rack-based Ruby webserver.

To instead configure Apache/nginx as the entry point, see here.

Ruby as the Entry Point

If you set up Apache or nginx as your entry point, then you can skip this part entirely. The oboe gem will take its tracing instructions from upstream automatically.

When the oboe gem is installed in your application, it makes available Oboe::Config, a nested hash of configuration options that affect how your application is instrumented.

Luckily, the defaults are very smart and only a couple initial values need to be configured.

The two required values are:

Oboe::Config[:tracing_mode] = 'always'
Oboe::Config[:sample_rate] = 100000  # 10% of incoming requests

These values enable tracing (:tracing_mode) and set the sample rate to 10% (:sample_rate). The sample rate is expressed as a value out of one million; e.g. 300000 equals a 30% sample rate, meaning 3 out of 10 requests are sampled. (For low-traffic sites, you may want to set the sample rate higher.)
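The out-of-one-million arithmetic can be made explicit with a tiny helper (this helper is illustrative only, not part of the oboe gem):

```ruby
# Convert an oboe-style sample_rate (value out of 1,000,000)
# to a percentage of requests sampled.
def sample_percentage(sample_rate)
  sample_rate * 100.0 / 1_000_000
end

puts sample_percentage(100_000)  # 10.0 -> 1 in 10 requests sampled
puts sample_percentage(300_000)  # 30.0 -> 3 in 10 requests sampled
```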

These values are usually set in a Rails initializer for Ruby on Rails. See the Optional Rails Initializer section on this page. A complete list of all configuration options for Oboe::Config is here.

Getting Performance Data

And that’s it. Now to the good stuff.

If you want to test that the oboe gem is functional, start up a Rails console; you should see a message on stdout similar to:

Tracelytics oboe gem successfully loaded.

Note that on hosts that don’t have the system daemon installed, the oboe gem disables itself and outputs a message to that effect.

Unsupported Tracelytics environment (no libs).  Going No-op.

Deploy/restart your application and you should start seeing traces show up in your TraceView dashboard after a couple minutes.


Installation and setup of TraceView for your application is a simple two-step process that can be done in 10 minutes or less. TraceView gives a unique, in-depth view into requests even as they cross hosts and software stacks.

Things are moving fast for the Ruby language instrumentation in TraceView. We already support tracing of memcache-client, memcached, dalli, mongo, moped, mongoid, mongomapper, cassandra, and ActiveRecord (postgres, mysql, mysql2), plus more. Most recently we added support for Rack and Resque tracing. For a full list of supported libraries, see the top of this article.

If you haven’t tried out TraceView yet, give it a run. You won’t be disappointed.

Extras: Some Random Chart Porn

A screenshot that I sent to Linode when performance unexpectedly dropped:

Linode migrated my VPS to a lesser-utilized host, with evident results (thanks Linode):

An older issue that Gameface had with atrocious rendering times: