
Improve Your App’s Performance with Memcached



One of the easiest ways to improve your application's performance is by putting a caching solution in front of your database. In this tutorial, I'll show you how to use Memcached with Rails, Django, or Drupal.

Memcached is an excellent choice for this problem, given its solid history, simple installation, and active community. It is used by companies big and small, including giants such as Facebook, YouTube, and Twitter. The Memcached site itself describes Memcached well: a "Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load."


In general, database calls are slow, since the query takes CPU resources to process and data is (usually) retrieved from disk. An in-memory cache like Memcached, on the other hand, uses very little CPU, and data is retrieved from memory instead of disk. The light CPU load is a result of Memcached's design: unlike an SQL database, it cannot be queried. Instead, it stores all data as key-value pairs, and you cannot retrieve a value from Memcached without first knowing its key.

Memcached stores the key-value pairs entirely in memory. This makes retrieval extremely fast, but it also makes the data ephemeral. In the event of a crash or reboot, memory is cleared and all key-value pairs need to be rebuilt. There are no built-in high-availability or fail-over systems within Memcached. However, it is a distributed system: the client spreads keys across multiple nodes. If one node is lost, the keys it held become cache misses that must be rebuilt, while the remaining nodes carry on serving requests and fill in for the missing node.
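
As a rough illustration of how keys end up spread across nodes: in a typical deployment the client library, not the server, picks a node by hashing the key. The sketch below uses simple modulo hashing and made-up node addresses; real clients often use consistent hashing instead, so treat this as an illustration rather than how any particular client works.

```python
import hashlib

# Hypothetical node addresses, for illustration only.
NODES = ["10.1.1.1:11211", "10.1.1.2:11211", "10.1.1.3:11211"]

def node_for(key, nodes):
    """Pick a node for a key by hashing the key (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

keys = ["user:1", "user:2", "user:3", "user:4"]
placement = {key: node_for(key, NODES) for key in keys}

# Simulate losing one node: keys are rehashed over the survivors.
# Values that lived on the lost node are gone and must be rebuilt
# from the database on the next cache miss.
survivors = NODES[:-1]
after_failure = {key: node_for(key, survivors) for key in keys}
```

With modulo hashing, most keys move to a different node whenever the node count changes, which is one reason consistent hashing is a popular alternative.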

Installing Memcached

Installing Memcached is a fairly simple process. It can be done through a package manager or by compiling it from source. Depending on your distribution, you may want to compile from source, since the packages tend to fall a bit behind.

# Install on Debian and Ubuntu
apt-get install memcached

# Install on Redhat and Fedora
yum install memcached

# Install on Mac OS X (with Homebrew)
brew install memcached

# Install from Source
wget http://memcached.org/latest
tar -zxvf memcached-1.x.x.tar.gz
cd memcached-1.x.x
./configure
make && make test
sudo make install

You'll want to configure Memcached for your specific needs, but, for this example, we'll just get it running with some basic settings.

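# -m: memory limit in MB, -c: max simultaneous connections,
# -p: TCP port to listen on, -d: run as a daemon
memcached -m 512 -c 1024 -p 11211 -d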

At this point, you should be up and running with Memcached. Next, we'll look at how to use it with Rails, Django and Drupal. It should be noted that Memcached is not restricted to being used within a framework. You can use Memcached with many programming languages through one of the many clients available.
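
Part of the reason clients exist for so many languages is that Memcached speaks a simple text protocol over TCP. As a rough sketch (not a replacement for a real client), the helpers below build set and get commands and parse a single-key get response. The function names are made up for this example.

```python
def build_set(key, value, exptime=0, flags=0):
    """Build a text-protocol 'set' command for a bytes value."""
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(value))
    return header.encode("ascii") + value + b"\r\n"

def build_get(key):
    """Build a text-protocol 'get' command for one key."""
    return "get {}\r\n".format(key).encode("ascii")

def parse_get_response(response):
    """Return the value from a single-key 'get' response, or None on a miss."""
    if response == b"END\r\n":
        return None
    header, rest = response.split(b"\r\n", 1)
    # Header looks like: VALUE <key> <flags> <bytes>
    size = int(header.split()[3])
    return rest[:size]
```

Sent over a socket to the port chosen above, a successful set is answered with STORED, and a get for a missing key is answered with just END.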

Using Memcached with Rails 3

Rails 3 has abstracted the caching system so that you can change the client to your heart's desire. In Ruby, the preferred Memcached client is Dalli.

# Add Dalli to your Gemfile
gem 'dalli'

# Enable Dalli in config/environments/production.rb:
config.action_controller.perform_caching = true
config.cache_store = :dalli_store, 'localhost:11211'

In development mode, you will not normally hit Memcached, so either start Rails in production mode with rails server -e production, or add the above lines to your config/environments/development.rb.

The simplest use of the cache is through write/read methods to retrieve data:

Rails.cache.write 'hello', 'world'      #=> true
Rails.cache.read 'hello'                #=> "world"

The most common pattern for Rails caching is using fetch. It will attempt to retrieve the key (in this case, expensive-query) and return the value. If the key does not exist, it will execute the passed block and store the result in the key.

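results = Rails.cache.fetch 'expensive-query' do
  # The block's return value is what gets cached. Calling to_a forces
  # the query to run so the records, not a lazy relation, are stored.
  Transaction.
    joins(:payment_profile).
    joins(:order).
    where(':created > orders.created_at', :created => Time.now).
    to_a
end
# ... more code working with results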

In the example above, the problem is cache expiry (one of the two hard problems in computer science): the cached value keeps being returned even after the underlying data changes. A robust solution is to use some part of the results in the cache key itself, so that when the results change, the key changes and the stale entry is no longer read.

users = User.active
users.each do |u|
  Rails.cache.fetch "profile/#{u.id}/#{u.updated_at.to_i}" do
    u.profile
  end
end

Here, we're using the epoch of updated_at as part of the key, which gives us built-in cache expiration. If a user's updated_at time changes, we will get a cache miss on the pre-existing profile cache and write out a new one. For this to work, we need to update the user's updated_at time whenever their profile is updated. That is as simple as adding:

class Profile < ActiveRecord::Base
  belongs_to :user, touch: true
end

Now, you have self-expiring profiles without any worry about retrieving old data when the user is updated. It's almost like magic!
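
The key-based expiry idea is not specific to Rails. As a rough sketch in Python, with a plain dictionary standing in for Memcached and hypothetical profile_key and fetch_profile helpers, a changed timestamp yields a new key and therefore a cache miss:

```python
from datetime import datetime, timezone

cache = {}  # stand-in for Memcached, for illustration only

def profile_key(user_id, updated_at):
    """Build a key that changes whenever updated_at changes."""
    return "profile/{}/{}".format(user_id, int(updated_at.timestamp()))

def fetch_profile(user_id, updated_at, load_profile):
    """Return the cached profile, loading it on a cache miss."""
    key = profile_key(user_id, updated_at)
    if key not in cache:
        cache[key] = load_profile(user_id)  # cache miss: rebuild
    return cache[key]

calls = []  # records every time the "database" is hit

def load_profile(user_id):
    calls.append(user_id)
    return {"user_id": user_id}

t1 = datetime(2012, 11, 1, tzinfo=timezone.utc)
t2 = datetime(2012, 11, 2, tzinfo=timezone.utc)
fetch_profile(7, t1, load_profile)  # miss, loads the profile
fetch_profile(7, t1, load_profile)  # hit, same key
fetch_profile(7, t2, load_profile)  # miss, updated_at changed
```

The old entry is never deleted explicitly. With a real Memcached server, it simply stops being read and is eventually evicted when memory is needed for newer items.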

Using Memcached with Django

Once you have Memcached installed, it is fairly simple to access with Django. First, you'll need to install a client library. We'll use pylibmc.

# Install the pylibmc library
pip install pylibmc

# Configure cache servers and binding in settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

Your app should be up and running with Memcached now. Like other libraries, you'll get basic getter and setter methods to access the cache:

from django.core.cache import cache

cache.set('hello', 'world')
cache.get('hello')             #=> 'world'

You can conditionally set a key if it does not already exist with add. If the key already exists, the new value will be ignored.

cache.set('hello', 'world')
cache.add('hello', 'mundus')
cache.get('hello')              #=> 'world'
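
The read-or-compute pattern that Rails exposes as fetch can be sketched on top of get and add. The FakeCache class below is a stand-in made up for this example so that it runs without a Memcached server; with Django you would pass the real cache object instead, assuming it offers the same get and add methods shown above.

```python
class FakeCache(object):
    """In-memory stand-in with the get/set/add methods used above."""
    def __init__(self):
        self._data = {}
    def get(self, key, default=None):
        return self._data.get(key, default)
    def set(self, key, value):
        self._data[key] = value
    def add(self, key, value):
        # Only store the value if the key is not already present.
        if key in self._data:
            return False
        self._data[key] = value
        return True

def fetch(cache, key, compute):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.add(key, value)
    return value

cache = FakeCache()
calls = []  # records every time the expensive work runs

def expensive_query():
    calls.append(1)
    return [1, 2, 3]

first = fetch(cache, 'expensive-query', expensive_query)   # computes
second = fetch(cache, 'expensive-query', expensive_query)  # served from cache
```

One limitation of this sketch is that a cached value of None is indistinguishable from a miss, so such results are recomputed on every call.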

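Borrowing from the Python Decorator Library, you can create a memoized decorator to cache the results of a function call. Note that this version keeps its results in a per-process dictionary rather than in Memcached.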

import functools

class memoized(object):
    '''Decorator. Caches a function's return value each time it is called.
    If called later with the same arguments, the cached value is returned
    (not reevaluated).
    '''
    def __init__(self, func):
        self.func = func
        self.cache = {}
    def __call__(self, *args):
        try:
            return self.cache[args]
        except KeyError:
            # first call with these arguments: compute and remember.
            value = self.cache[args] = self.func(*args)
            return value
        except TypeError:
            # uncacheable. a list, for instance.
            # better to not cache than blow up.
            return self.func(*args)
    def __repr__(self):
        '''Return the function's docstring.'''
        return self.func.__doc__
    def __get__(self, obj, objtype):
        '''Support instance methods.'''
        return functools.partial(self.__call__, obj)

@memoized
def fibonacci(n):
    "Return the nth fibonacci number."
    if n in (0, 1):
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(12))

Decorators can give you the power to take most of the heavy lifting out of caching and cache expiration. Be sure to take a look at the caching examples in the Decorator Library while you are planning your caching system.
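
As a rough sketch of that idea, the decorator below stores results in a shared cache object rather than a per-process dictionary, and builds the key from the function name and its arguments. The names cached and DictCache are hypothetical, and DictCache is only a stand-in that ignores the timeout so the example runs without a server.

```python
import functools

class DictCache(object):
    """Stand-in for a real cache client; it ignores the timeout."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def set(self, key, value, timeout=None):
        self.data[key] = value

def cached(cache, timeout=300):
    """Cache a function's return value under a key built from its arguments."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = "{}:{}".format(func.__name__, ":".join(repr(a) for a in args))
            value = cache.get(key)
            if value is None:
                # Cache miss: run the function and store the result.
                value = func(*args)
                cache.set(key, value, timeout)
            return value
        return wrapper
    return decorator

store = DictCache()

@cached(store, timeout=60)
def slow_square(n):
    return n * n

result = slow_square(12)
```

Memcached keys cannot contain whitespace or control characters and are limited to 250 bytes, so a production version would need to sanitize or hash keys built from arbitrary arguments.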

Using Memcached with Drupal

To use Memcached with Drupal, first install the PHP extension for Memcached.

# Install the Memcached extension
pecl install memcache

; Configure Memcached in php.ini
[memcache]
memcache.hash_strategy = consistent
memcache.default_port = 11211

<?php
    // Tell Drupal about Memcached in settings.php
    $conf['cache_backends'][] = 'sites/all/modules/contrib/memcache/memcache.inc';
    $conf['cache_default_class'] = 'MemCacheDrupal';
    $conf['memcache_key_prefix'] = 'app_name';
    $conf['memcache_servers'] = array(
        '10.1.1.1:11211' => 'default',
        '10.1.1.2:11212' => 'default'
    );
?>

You'll need to restart your web server for all the changes to take effect.

As expected, you'll get the standard getter and setter functions with the Memcached module. One caveat is that cache_get returns a cache object rather than the value itself, so you'll need to read the value from its data property.

<?php
    cache_set('hello', 'world');
    $cache = cache_get('hello');
    $value = $cache->data;  #=> returns 'world'
?>

And just like that, you've got caching in place in Drupal. You can build custom functions to replicate functionality such as cache.fetch in Rails. With a little planning, you can have a robust caching solution that will bring your app's responsiveness to a new level.

And You're Done


Implementing a caching system can be fairly straightforward. With the right configuration, a caching solution can extend the life of your current architecture and make your app feel snappier than it ever has before. While a good caching strategy takes time to refine, it shouldn't stop you from getting started.

As with any complex system, monitoring is critical. Understanding how your cache is being utilized and where the hotspots are in your data will help you improve your cache performance. Memcached has a quality stats system to help you monitor your cache cluster. You should also use a tool like New Relic to keep an eye on the balance between cache and database time. As an added bonus, you can get a free 'Data Nerd' T-shirt when you sign up and deploy.
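
One simple number to watch is the hit ratio. The sketch below parses the text returned by Memcached's stats command and computes the ratio from the get_hits and get_misses counters. The helper names are hypothetical and the sample text is made up for illustration, not real output.

```python
def parse_stats(text):
    """Parse 'STAT <name> <value>' lines into a dictionary."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

def hit_ratio(stats):
    """Fraction of get requests that were served from the cache."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    total = hits + misses
    return hits / float(total) if total else 0.0

# Made-up sample in the shape of a stats response.
sample = "STAT get_hits 900\r\nSTAT get_misses 100\r\nEND\r\n"
ratio = hit_ratio(parse_stats(sample))
```

A ratio that stays low suggests that keys are expiring or being evicted too quickly, or that the wrong data is being cached.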
