This document is some of my notes culled from a day of research into Drupal performance tuning. I’ve broken it down into several different levels that all impact a Drupal site’s performance. For best results, apply some or all recommendations from each of these levels. They are ordered from project-specific to server-specific, with the final one being testing. There’s also a lather-rinse-repeat cycle at play here.


Project-level Optimizations

  • Do not wait until day of launch to consider performance!

    • Client should not be the first person to report performance issues!
  • Staging (aka launch prep) needs to be a part of the process — don’t go from dev to launch w/o analysis

  • Module development needs to use cache tables (or direct API access to memcache, etc.) when applicable

    • Reduce node_load()s, node_save()s, etc.
  • Project managers need to build in time for performance tuning on the client server, plus a pre-launch analysis of their server environment

  • If setting up a new server, DO NOT LAUNCH WITH APACHE/MYSQL/PHP DEFAULTS FROM MEDIATEMPLE, ZOMG1!1

    • JR TODO: have a document for all our default configurations & recommendations

Drupal-level Optimizations

Drupal Cache

  • Best solution: Get those cache_* tables out of MySQL!

    • Memcache + memcache module
    • Memcache/APC + Cache Router module
  • Turn built-in caching on (for anonymous users only); see the settings.php sketch below

    • Minimum cache lifetime = the minimum amount of time cached data stays in the cache table before it can be flushed
      • Setting it to 1 hour means a cached node object may be up to an hour stale before it is refreshed
    • Aggressive caching (page_fast_cache) may work well for some cases, but it skips hook_boot()/hook_exit(), so check that no enabled modules rely on them
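
A minimal settings.php sketch for locking these settings in from code (a sketch only; variable names assume Drupal 7, while Drupal 6 uses the CACHE_NORMAL / CACHE_AGGRESSIVE constants for the page cache):

      // settings.php: hedged sketch, assuming Drupal 7 variable names.
      $conf['cache'] = 1;              // page cache for anonymous users
      $conf['cache_lifetime'] = 3600;  // minimum cache lifetime: 1 hour
      $conf['block_cache'] = 1;        // cache block output as well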

Eliminate 404 errors

  • always add this directive to .htaccess:

      <FilesMatch "\.(png|gif|jpe?g|s?html?|css|js|cgi|ico|swf|flv|dll)$">
        ErrorDocument 404 default
      </FilesMatch>

    There is no need to bootstrap Drupal just to render a 404 page for these assets! Most of these requests happen in the background (missing images or stale CSS/JS references), so nobody sees the error page anyway.

Logging

  • Decrease PHP’s error reporting level to avoid unnecessary logging (or fix the warnings)

  • DB Logging

    • Drupal logs to database by default. (bad bad bad bad)
    • Watchdog table can bloat quickly if not regularly pruned.
      • adjust settings admin -> config -> logging
    • Watchdog table should never be MyISAM (never ever ever)
  • Syslog

    • Writes watchdog() calls to the operating system's log daemon, e.g. syslogd or syslog-ng

    • Better choice overall as syslogd et al. can be configured to sort messages into separate files, e-mail log results based on regular expression triggers, etc.

    • Only available to us if we have root/sudo access to be able to configure syslog properly

    • Possible Drupal contrib modules to aid in monitoring syslog’d watchdog?

    • http://mattdanger.net/2011/02/setup-syslog-on-drupal-6-x-and-7-x/ (Setup Syslog on Drupal 6.x and 7.x)

    • Syslog-ng config (/etc/syslog-ng.conf on arch): https://wiki.archlinux.org/index.php/Syslog-ng

          destination d_drupal { file("/var/log/drupal.log"); };
          filter f_drupal { program("drupal"); };
          log { source(src); filter(f_drupal); destination(d_drupal); };
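
    To make the program("drupal") filter above match, the log identity can be pinned in settings.php (a hedged sketch, assuming Drupal 7's Syslog module and its syslog_identity / syslog_facility variables):

          // settings.php: hedged sketch, assuming Drupal 7's Syslog module.
          $conf['syslog_identity'] = 'drupal';    // must match program("drupal") in the syslog-ng filter
          $conf['syslog_facility'] = LOG_LOCAL0;  // optional: route Drupal to a dedicated facility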

Pressflow

  • Performance-enhanced replacement for standard Drupal distro

  • Supports a db replication workflow and can send read-only db calls to a slave

  • External page cache

  • Fast path alias detection

  • Reverse proxy support (e.g. Varnish)

  • First path argument whitelist

  • Native JSON encoding

  • Removal of LOWER() for user-lookup queries

Other tuning

  • Disable modules you are not using

    • Schema
    • Devel (and related modules)
    • Views UI (and other modules where UI is separated from engine)
    • Statistics (use Google Analytics if you want stats)
  • Prune Sessions table by adjusting max lifetime of sessions:

      ini_set('session.gc_maxlifetime', 86400);  // 24 hours (in seconds)
      ini_set('session.cache_expire', 1440);     // 24 hours (in minutes)

    • Anonymous users: reduce the cookie lifetime (this also changes for auth users, so it can't be disabled completely):

        ini_set('session.cookie_lifetime', 86400);  // 24 hours (in seconds)

PHP-level Tuning (i.e. post-Drupal, pre-MySQL)

Opcode Cache

  • eAccelerator may be unstable with recent versions of PHP and Drupal.

XCache

  • Related to lighttpd project (I heart lighttpd)
  • Available in Debian/Ubuntu core repo

APC

  • Increase apc.shm_size from the default of 32MB to 48MB (see the php.ini sketch at the end of this section)

    • Simply loading all enabled Drupal modules may exceed the 32MB limit, rendering the cache ineffective
  • /dev/zero fix for shm_size limits in OS layer

  • Consult apc.php which will output apc utilization stats (May be available in Drupal module?)

  • http://2bits.com/articles/importance-tuning-apc-sites-high-number-drupal-modules.html#comment-1228:

    Those with limited resources for APC may consider using apc.filters. Setting it to include only very common files results in a very high hit percentage with limited memory.

    Example: exclude admin files and include only .php and .inc files:

        apc.filters = "+inc$,+php$,-admin"

    Or, for very limited resources, also consider excluding contributed modules so that only Drupal core is cached:

        apc.filters = "+inc$,+php$,-admin,-sites/all/modules"
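
Pulling the APC notes above into one place, a hedged php.ini sketch (values are illustrative, not recommendations):

      ; php.ini: hedged sketch of the APC settings discussed above
      apc.shm_size = 48M                  ; older APC builds want a bare integer number of MB, i.e. 48
      apc.mmap_file_mask = /dev/zero      ; the /dev/zero mmap fix for OS shared-memory limits
      apc.filters = "+inc$,+php$,-admin"  ; optional: cache only .inc/.php, skip admin files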

Memcache

  • Requirements:

    • Memcached daemon
    • PECL memcache (or memcached) PHP extension installed
    • Memcache Drupal module
    • Plenty of available RAM (the more the merrier)
  • Memcache Drupal module:

    • API for using memcache in general
    • Drop-in replacement for cache_get/cache_set
    • Drop-in replacement for sessions (avoiding db calls for sessions!)
  • Recommendations

    • At minimum, just use memcache for the Drupal cache tables; it's much faster than the db layer (see the settings.php sketch below)
    • Custom modules doing a lot of db work can use API available from memcache module
  • Prior to 6.x-1.5 of the memcache module, you needed a separate memcached process for each bin; now one bin can rule them all

    • Counterpoint: you can still use multiple bins, better for keeping track of utilization

How much memory do you need (for memcache)? Look at your object sizes in the mysql cache tables and add them up. (hint: probably not a lot). You don’t need to be exact, you just need to get it within the correct order of magnitude.
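
A hedged settings.php sketch for pointing the Drupal cache at memcache (paths and variable names assume the memcache contrib module; check its README for your exact version):

      // settings.php: hedged sketch, assuming the memcache contrib module.
      // Drupal 6:
      $conf['cache_inc'] = 'sites/all/modules/memcache/memcache.inc';
      // Drupal 7 instead uses:
      //   $conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
      //   $conf['cache_default_class'] = 'MemCacheDrupal';
      $conf['memcache_servers'] = array('127.0.0.1:11211' => 'default');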

Other tuning

  • Cache Router module http://drupal.org/project/cacherouter
    • CacheRouter is a caching system for Drupal that lets you assign individual cache tables to a specific caching technology.
    • Memcache could handle certain bins, and APC could handle others
    • Note: Cache Router does not support session caching in memcache

Apache-level Tuning

Consider FastCGI (mod_fcgid)

  • Under mod_php, every apache process includes the memory footprint of PHP regardless of whether it is serving PHP

    • CSS/JS, images, PDFs, etc. (_static content_) all require the same base memory footprint as the initial request for Drupal
    • Over time, the required memory per apache child process becomes the largest required memory for a request, i.e. quite a bit larger than needed for static content
  • Under FastCGI, PHP is separated from apache.

    • Advantage: less memory usage within apache child processes
    • Advantage: fewer MySQL connections
      • Each PHP instance opens a connection to MySQL; under mod_php that means every apache process
  • In a 1:1 relationship, mod_php and fastCGI serve PHP with (roughly) the same resource usage — but when serving static content, PHP is skipped entirely

    • (This would result in incredible resource savings when using Boost!)
  • Downsides include:

    • Opcode caching is less effective as it is not shared across all PHP processes, just a single process
    • APC would be less effective as an alternative to memcache for the same reason - too many respawns
    • In Drupal, you don’t get the pretty file upload progress bar (but uploads work just fine)
  • Note: Under FastCGI, Apache MPM worker (threaded) is better than prefork (less memory usage per request)

Prefork+mod_php Config

  • Apache prefork is the better choice for mod_php (over MPM worker)

    • Also easier to troubleshoot a runaway child process (kill one process vs. restarting all of apache)
  • When using prefork, immediately prefork a large pool of processes (as many as memory can support) to avoid churn

    • StartServers, MinSpareServers, MaxSpareServers, and MaxClients can all be set to the same amount to keep the PHP pool size constant

    • Goal: keep as many idle PHP interpreters hanging around for as long as possible. It's better to keep the pool at a constant size and let it fill server memory

    • Set MaxRequestsPerChild to a very high number, or 0
      • Note: memory leaks may occur over time with MaxRequestsPerChild = 0

    • Disable any Apache LoadModule lines that aren't needed; this decreases the memory footprint of each child process

    Example:

      StartServers 40
      MinSpareServers 40
      MaxSpareServers 40
      MaxClients 80
      MaxRequestsPerChild 20000

    i.e. Start at 40 child processes, stay at 40 even when idle, burst up to 80 in heavy load.

    LOL: “There is no prize for having a lot of free memory. ‘My server is slow, but look at all that free RAM!!!’ If you have memory, then use it!” Pro D7 Dev, Ch. 23, p. 505

Counterpoint: Memory usage in apache child processes tends to equal the largest page served by that child process, which means over time each of those MinSpareServers can become enormous. This may or may not be a good thing. Alternatively, set MaxRequestsPerChild to 2000 (or lower) to more frequently respawn while maintaining the same constant number of child processes.

Increasing MaxClients will only worsen a memory problem, as it will attempt to serve even more requests.

  • A simple calculation for MaxClients on a system that does only Drupal would be: (Total Memory - Operating System Memory - MySQL memory) / Size Per Apache process.
  • If you tune Apache well, and remove all the unneeded modules, and install a PHP op-code cache/accelerator, then you can make each Apache process take as little as 12 MB.
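
Worked example (illustrative numbers only): on a 2 GB box that reserves 256 MB for the OS and 512 MB for MySQL, with roughly 16 MB per Apache process, MaxClients = (2048 - 256 - 512) / 16 = 80.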

Consider not using Apache at all

  • Lighttpd is my personal fave: it's far superior to apache, especially when serving static content, follows a K.I.S.S. design principle, and, when using FastCGI, serves Drupal with exactly the same performance as apache. It's a win/win.
    • One potential gotcha (which I think is actually a win) is that Plesk doesn’t understand what Lighttpd is. It probably thinks it’s some kind of lightbulb from Ikea. (But we also hate Plesk, right? So this is a moot point, right?)

Other tuning

  • Decrease Timeout from 5 mins to 20 seconds or less
  • Minor: move .htaccess into server config files, disable searching for .htaccess
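
A hedged httpd.conf sketch of both tweaks (the docroot path is hypothetical; adjust to your vhost):

      # httpd.conf / vhost: hedged sketch of the two tweaks above
      # Drop Timeout from the 300-second default
      Timeout 20
      # Hypothetical docroot; stop per-request .htaccess lookups
      <Directory "/var/www/html">
          AllowOverride None
          # ...paste the relevant rules from Drupal's .htaccess here...
      </Directory>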

MySQL-level Tuning

MySQL query cache

  • May be disabled by default, so always check whether it is turned on
    • Go to admin -> reports -> status report and click on MySQL version number to get a variables report
  • No such thing as optimal cache size for all cases
    • Too small, too much churn
    • Too large is wasteful and cache lookups can take longer (relatively speaking)
    • Query cache memory could often be put to better use: more web processes, memcache, etc.
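
A hedged sketch for checking and sizing the query cache (the SQL is standard MySQL; the my.cnf values are illustrative, not recommendations):

      -- Is the query cache enabled, and is it being hit?
      SHOW VARIABLES LIKE 'query_cache%';
      SHOW STATUS LIKE 'Qcache%';

...and size it in my.cnf:

      query_cache_type = 1
      query_cache_size = 32M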

InnoDB (vs. MyISAM, etc.)

  • MyISAM uses table-level locking which means the entire table is blocked during a write

    • “Blocked” means that even a SELECT statement will have to wait for that write to finish!
  • InnoDB uses row-level locking which will only affect a given row during a write

    • Drupal 7 uses InnoDB by default
  • Should you convert to InnoDB for all tables?

    • Probably not for infrequently edited tables: MyISAM table reads are faster
    • Candidates for InnoDB:
      • cache_*
      • watchdog (if not using syslog)
      • sessions
      • accesslog
  • To convert a table to InnoDB: (take site offline first!)

      ALTER TABLE {tablename} ENGINE=InnoDB;  -- ENGINE replaces the deprecated TYPE keyword
  • Analyze lock contention by checking the Table_locks_immediate and Table_locks_waited status variables: SHOW STATUS LIKE 'Table%';

    Table_locks_immediate = number of times a table lock was acquired w/o waiting
    Table_locks_waited = number of times waiting was necessary to acquire a lock

    • If the Table_locks_waited value is high, and you’re having performance issues, consider splitting large tables into multiple smaller tables (like for cache tables, etc.)

Other MySQL tuning notes

  • Idea: MySQL Master/Slave replication and using it for read/write separation

    • Use slave mysqld as a read-only server (only allowing writes to master, which is then replicated to slave in background)
    • Potential gotcha: when replication lag is too long, a write may not be immediately available on the slave
    • Supported by Pressflow distro (and Drupal 7)
  • Analyzing slow query log

    • Set this at its lowest threshold (1 second) and leave it on all the time (see the my.cnf sketch after this list)
    • Go through this log at least once a month and after every major release
      • Queries that were fine last month can become slow once a table gets too big
      • Queries that were fine in QA can collapse under a real world load
    • A good tool for combing through these logs is mysqlsla
  • Using MySQL profiling

    • SET PROFILING = 1; will enable profiling for that session

    • Execute queries

    • SHOW PROFILES; will show how those queries performed

    • Additional information is available, e.g.:

          SELECT state, duration, cpu_user+cpu_system AS cpu,
          block_ops_in+block_ops_out AS blocks_ops
          FROM information_schema.profiling
          WHERE query_id = 1 AND
          duration > 0.000999;
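
For the slow query log mentioned above, a hedged my.cnf sketch (variable names assume MySQL 5.1+; MySQL 5.0 uses log_slow_queries instead; the log path is hypothetical):

      # my.cnf: hedged sketch for the slow query log
      slow_query_log      = 1
      slow_query_log_file = /var/log/mysql/mysql-slow.log
      long_query_time     = 1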

Server-level Tuning

  • Reduce the number of services that aren’t directly supporting apache and mysql

    • Does Courier-IMAP really need to be running?
    • Plesk is evil and it steals the lifeforce from newborns, I’ve seen it happen
  • The more memory available to apache and mysql, the better.

    • If apache is configured correctly, that is.

Reverse Proxy and/or Static Caching (aka Band-Aids)

Varnish

  • Varnish does not support SSL(?)
  • Not a good solution for all cases:
    • not particularly easy to deploy
    • adds a layer of complexity for admins to troubleshoot after deploy
    • use it only as a final layer after all other optimizations have been applied, just like real-world varnish on wood

Boost

  • Boost can be configured to store cache dir in a tmpfs/ramfs mount for extremely fast IO
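
A hedged sketch of putting the Boost cache directory on tmpfs (the cache path is hypothetical; match it to Boost's cache file path setting, and remember tmpfs contents vanish on reboot):

      # /etc/fstab: hypothetical entry
      tmpfs  /var/www/html/cache  tmpfs  size=256m,mode=0775  0  0

      # or mount it by hand
      mount -t tmpfs -o size=256m,mode=0775 tmpfs /var/www/html/cache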

Testing

  • XCache, analyzing trace files
  • Tracelytics

Benchmarking

Apache Bench (ab) is more of a raw-speed benchmarking tool.

Siege lets you build test cases (see the URLs File I mentioned) which try to simulate large numbers of real users browsing your site - with a list of GET and POST requests, including POST data. Rather than each concurrent user flooding the server, it inserts delays between requests for a more true-to-life effect.
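
Hedged example invocations of both tools (URLs, counts, and delays are illustrative):

      # ab: raw speed, 1000 requests, 20 concurrent
      ab -n 1000 -c 20 http://example.com/

      # siege: 25 users, up to 3 s random delay between hits, run 5 minutes, URLs from a file
      siege -c 25 -d 3 -t 5M -f urls.txt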