a t e v a n s . c o m

(╯°□°)╯︵ <ǝlqɐʇ/>

Using Passenger, we ran into a strange problem - how to monitor and kill runaway pages which were eating up too much RAM? Our usual monitoring infrastructure, monit, didn’t have the facilities to deal with Passenger since it primarily works with processes that have statically-located PID files.

Ideally, you would want to have a stack trace and suchlike as well, so you could figure out what was killing your server without a bunch of nasty profiling tests and runs and comparative analysis.

Well, here’s a script that does that.

kill -6 $(passenger-memory-stats | grep '[1-9]...\.. MB.*Rails' | awk '{ print $1 }')

I put it into a cron that runs every minute. Here’s what that looks like:

*/1     *       *       *       *       /usr/local/bin/kill_bad_passengers

So, how’s it work? Well, wherever you’ve got Passenger installed, you can run passenger-memory-stats. This will give you a breakdown of which processes are running and how much memory they take up. We decided to kill anything taking more than a gig of RAM (hence the three dots after the character class in the grep statement up there: a non-zero digit plus three more characters before the decimal point means at least 1000 MB). kill -6 sends SIGABRT, which safely kills off the child process and logs an exception. Check the docs on that here.
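If the grep pattern feels too clever, here’s a rough Ruby sketch of the same filter that compares the memory column as a number instead. The sample output, the column layout, and the runaway_pids helper are all made up for illustration; check what passenger-memory-stats prints on your own box before trusting the regex.

```ruby
# Hypothetical sample of passenger-memory-stats output; the real column
# layout varies between Passenger versions.
SAMPLE = <<~OUT
  PID    VMSize     Private    Name
  1111   310.2 MB   98.3 MB    Rails: /var/www/app
  2222   1450.9 MB  1034.7 MB  Rails: /var/www/app
  3333   120.0 MB   1.2 MB     PassengerWatchdog
OUT

# Return the PIDs of Rails processes whose private memory is at least limit_mb.
def runaway_pids(stats_output, limit_mb)
  stats_output.each_line.filter_map do |line|
    # PID, VMSize, Private, then a name that starts with "Rails"
    m = line.match(/\A\s*(\d+)\s+[\d.]+ MB\s+([\d.]+) MB\s+Rails/)
    m[1].to_i if m && m[2].to_f >= limit_mb
  end
end

puts runaway_pids(SAMPLE, 1000).inspect
```

From there you’d hand each PID to Process.kill('ABRT', pid), which is the Ruby way of sending the same SIGABRT.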

Inspiration definitely came from Bill Harding (thanks!) and lots of help from fellow HeyZapper Chris. Thanks, guys!

Ideally in a Rails app, you’ll have extensive unit tests for all your model classes. Those tests will be updated and run before everyone deploys. And you get a pony, to make getting to the office easier. So, when you have a huge stack of untested models that are constantly being updated, is there a quick way to test them and make sure that at least there aren’t any syntax errors or that kind of thing? Well, here’s my 80% solution.

describe ActiveRecord::Base do
  it "should load all model classes and not throw exceptions" do
    models = []
    Dir["#{RAILS_ROOT}/app/models/*"].each do |file|
       matches = File.read(file).scan /^\s*class (\w+) < ActiveRecord::Base/
       matches.each do |match_data|
         match_data.each {|match| models << match unless models.include?(match) }
       end
    end
    models.each {|model| eval model }
  end
end

This is an RSpec test that finds all your model classes and loads them, kind of like Rails does when loaded in a production environment. That way, if there are any obvious take-down-the-site level screw-ups they’ll be pointed out to you in short order. Combine that with a continuous build server that blocks your deploys and you can get some pretty decent insurance that your server will restart successfully.

You may have to make some modifications to the regex if you’re using STI or other inheritance mechanisms. I had to put in the check for inheriting from ActiveRecord::Base in order to prevent it from trying to instantiate internal classes like custom exceptions without properly namespacing them.
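If the nested loops in that spec look odd, this little sketch shows why they’re there: String#scan with one capture group hands back an array of one-element arrays. The file contents here are made up, standing in for something under app/models.

```ruby
# Made-up file contents standing in for something under app/models
source = <<~SRC
  class User < ActiveRecord::Base
  end

  class ImportError < StandardError
  end

    class Post < ActiveRecord::Base
    end
SRC

# With one capture group, scan returns an array of one-element arrays
matches = source.scan(/^\s*class (\w+) < ActiveRecord::Base/)

# Flatten and de-duplicate to get a plain list of model names
model_names = matches.flatten.uniq

p matches      # [["User"], ["Post"]]
p model_names  # ["User", "Post"]
```

Note that ImportError is skipped, which is exactly what the ActiveRecord::Base check in the regex is for.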

Pro tip: When you’re trying to figure out whether your monit config file will wreck your monitoring service or not, your best friend is monit -t. Copy your existing config into your home directory, make any changes, and run something like monit -c yourfile -t and it will check the grammar for correctness, and all the programs you call for existence and ownership. It will even warn you if the file’s permissions are wrong! Made deploying monitoring scripts much less scary.

So, I set up cijoe for our company yesterday. It wasn’t too bad to set up; the instructions are pretty clear for how it works. Now that it’s running, we can see who broke the build, and I set up a post-build hook to notify our office chat room who broke it and link to the commit in our source browser. That’s pretty cool; but it’s still possible to take down the site if you commit something bad and deploy it before the build finishes.

Well, not anymore.

We use a shell script to wrap our capistrano deployment system, so I just added the following to the beginning of it:

RES=$(curl --write-out '%{http_code}' --silent --output /dev/null http://our.cijoe.server/ping)
if [ "$RES" -ne 200 ]; then
  echo "Can't deploy right now, check build status at http://our.cijoe.server"
  exit 1
fi

Now you can’t deploy until our server is done building, and you can’t deploy at all until the build passes. Better keep those tests up to date, eh?
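If your deploy wrapper is Ruby rather than shell, the same gate might look something like this. It’s a sketch, not what we run: ping_status and deploy_allowed? are names I made up, and the URL is whatever you pass on the command line.

```ruby
require 'net/http'
require 'uri'

# Fetch the HTTP status code of the CI ping URL as an Integer
def ping_status(url)
  Net::HTTP.get_response(URI(url)).code.to_i
end

# Only a 200 counts as "safe to deploy", same rule as the shell check
def deploy_allowed?(status)
  status == 200
end

# Only hit the network when a URL is passed on the command line
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  status = ping_status(ARGV[0])
  abort "Can't deploy right now (ping returned #{status})" unless deploy_allowed?(status)
  puts 'Build is green, deploying'
end
```

abort exits with a non-zero status, so anything wrapping the script can tell the deploy was refused.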

Quick tip: Zsh treats the ^ character as special (it’s a glob operator when extended globbing is turned on). So, if you want to do something like git reset --soft HEAD^, you have to escape the caret with a backslash: git reset --soft HEAD\^

Found here — thanks, wereHamster

RSpec’s transactional db clearing was inadequate for us. We are using FactoryGirl rather than RSpec’s built-in mocks framework, so none of those records were getting cleared out of the database. There’s a simple way to clear out your entire MySQL database before each test, ensuring that each test has a clean slate to start with. Drop this into your spec_helper in the Spec::Runner configuration:

config.before :each do
  (ActiveRecord::Base.connection.tables - %w{schema_migrations}).each do |table_name|
    ActiveRecord::Base.connection.execute "TRUNCATE TABLE #{table_name};"
  end
end

When I first put this in, however, I got the following error on nearly every controller test:

ActiveRecord::StatementInvalid in 'MongoUtil should sum platform play counts correctly with MongoUtil.sum'
Mysql::Error: SAVEPOINT active_record_1 does not exist: ROLLBACK TO SAVEPOINT active_record_1
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/connection_adapters/abstract_adapter.rb:227:in `log'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/connection_adapters/mysql_adapter.rb:324:in `execute'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/connection_adapters/mysql_adapter.rb:366:in `rollback_to_savepoint'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/connection_adapters/abstract/database_statements.rb:167:in `transaction'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/transactions.rb:182:in `transaction'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/transactions.rb:200:in `save!'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/transactions.rb:208:in `rollback_active_record_state!'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/activerecord-2.3.10/lib/active_record/transactions.rb:200:in `save!'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl/proxy/create.rb:6:in `result'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl/factory.rb:323:in `run'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl/factory.rb:267:in `create'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl/factory.rb:298:in `send'
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl/factory.rb:298:in `default_strategy' 
/Users/andy/.rvm/gems/ruby-1.8.7-p302@heyzap/gems/factory_girl-1.2.4/lib/factory_girl.rb:21:in `Factory'
./spec/models/mongo_util_spec.rb:11:

I had a lot of trouble tracking down exactly what was going on here, since ActiveRecord’s own use of Savepoints is pretty well tested. Eventually I figured out that RSpec’s transaction caching and management was not friendly with making Base.connection.execute calls. Running any dirty SQL totally hosed RSpec’s savepoints, and RSpec just could not deal with that.

Of course, since we’re doing our own db cleaning now, RSpec shouldn’t need to set any save points. So, I just turned them off:

config.use_transactional_fixtures = false

Problem solved.
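Here’s the table-selection part of that cleanup hook pulled out into plain Ruby, so you can poke at it without a database. The truncate_statements helper and the table names are made up for the example; in the real hook the list comes from ActiveRecord::Base.connection.tables.

```ruby
# Tables that should survive the cleanup
KEEP = %w[schema_migrations]

# Build one TRUNCATE statement per table, skipping the ones in keep
def truncate_statements(tables, keep = KEEP)
  (tables - keep).map { |name| "TRUNCATE TABLE #{name};" }
end

tables = %w[users schema_migrations games] # hypothetical table list
puts truncate_statements(tables)
```

Array subtraction does the filtering, so adding another table you want left alone is just one more entry in KEEP.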

I set up Resque for our startup over the last few days, to put some log analysis stuff in the background. It was harder than I thought it should be, largely due to a lack of documentation for compatibility with common tools like Capistrano and daemon kickoff tools. Here’s how I did it.

Step 1: Install Redis.

This part’s easy on OS X:

brew install redis

I already had Redis running for some other tasks on the server, so I just used the existing install for Resque. I’ll get to that in a minute.

Step 2: Add Resque to your Gemfile.

You are using Bundler, right? If not, you should be. I also decided to use Resque Retry, a plugin for Resque that allows it to transparently retry failed jobs, with advanced retry strategies like exponential back off.

gem "resque"
gem "resque-retry"
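In case exponential backoff is new to you, here’s the general idea in a few lines of Ruby. This only illustrates the concept; the backoff_delays helper and its numbers are mine, not the schedule Resque Retry actually uses, so check its README for the real defaults.

```ruby
# Delay before each retry doubles: base, 2*base, 4*base, ...
def backoff_delays(base_seconds, attempts)
  (0...attempts).map { |n| base_seconds * (2**n) }
end

p backoff_delays(30, 4) # [30, 60, 120, 240]
```

The point is that a flaky dependency gets progressively more breathing room instead of being hammered every few seconds.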

Step 3: Drop in the Resque initializer.

For us, this was really simple, since I already had a Redis initializer. The documentation for initializing Resque is pretty straightforward, too. Just assign Resque.redis to either a config hash, or to an existing Redis object. Ours looks like this.

require 'resque'
Resque.redis = REDIS

I also implemented the built-in support for Hoptoad, which necessitates an extra require and a small configuration block, but it was no big deal.

Step 4: Add Resque rake tasks.

I just dropped resque.rake into lib/tasks, and this is all there is to it:

require 'resque/tasks'
require 'resque_scheduler/tasks'

This will give you all the rake tasks you need to run Resque items.

Step 5: Write a Resque worker.

This part wasn’t too bad, either:

require 'resque-retry'

class ResqueTestWorker
  extend Resque::Plugins::ExponentialBackoff
  @queue = :resque_test_queue

  def self.perform(*args)
    MONGO['test_queue'].insert(args)
  end
end

All I’m doing in ours (since this is a proof-of-concept experiment) is taking some arguments to a function and throwing them into a Mongo collection. Later, I can run some queries on the mongo collection and make sure it’s what I expect.
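One thing worth knowing when you write perform: Resque stores each job as JSON (a class name plus its arguments), so what arrives in args has been through a serialize and parse round trip. Here’s a simplified imitation of that trip in plain Ruby. It is not Resque’s actual code, and RecordingWorker is a made-up stand-in that records its calls instead of writing to Mongo.

```ruby
require 'json'

# Hypothetical worker that records what it was called with
class RecordingWorker
  @queue = :resque_test_queue
  @calls = []

  class << self
    attr_reader :calls
  end

  def self.perform(*args)
    @calls << args
  end
end

# Roughly what gets stored on enqueue: a class name plus arguments, as JSON
payload = JSON.generate('class' => 'RecordingWorker', 'args' => [42, :sym])

# Roughly what a worker does later: decode, look up the class, call perform
job = JSON.parse(payload)
Object.const_get(job['class']).perform(*job['args'])

p RecordingWorker.calls # [[42, "sym"]]
```

Notice the symbol came back as a string. Stick to simple JSON-friendly arguments (ids, strings, numbers) and look records up again inside perform.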

Step 6: Configure backend server.

Resque ships with a nice little Sinatra app to keep track of workers, jobs, and failures. It also has some additions from Resque Retry. As far as I know, this server uses the Redis defaults when launched, and that’s not going to work on staging and production environments. So, I have to do a little configuration. I put together a Rackup file to configure a simple thin server to run the Resque backend on. Here’s what it looks like:

#!/usr/bin/env ruby
require 'logger'
require 'yaml'
require 'resque/server'
require 'resque-retry'
require 'resque-retry/server'

use Rack::ShowExceptions

redis_config = YAML.load_file(File.expand_path(File.dirname(__FILE__) + '/redis.yml'))[ENV['RESQUE_THIN_ENV']]
rc = {}; redis_config.each{|k,v| rc[k.to_sym] = v } # symbolize keys from yaml
Resque.redis = Redis.new(rc)

# Basic auth password that protects the Resque backend. Replace the example value with your own.
AUTH_PASSWORD = "example_password"
if AUTH_PASSWORD
  Resque::Server.use Rack::Auth::Basic do |username, password|
    password == AUTH_PASSWORD
  end
end

run Resque::Server.new

As you can see, it loads our existing Redis.yml without having access to the Rails initializer for Redis. I also include all the bells and whistles from Retry, and at the bottom run the Rack app included in all that.
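The key-symbolizing line in that rackup file is a bit dense, so here it is on its own with an inline YAML document. The contents of this redis.yml stand-in (hosts, ports, environment names) are invented for the example.

```ruby
require 'yaml'

# Invented stand-in for a redis.yml with one section per environment
REDIS_YML = <<~YAML
  development:
    host: localhost
    port: 6379
  production:
    host: redis.internal.example
    port: 6380
YAML

# Pick the section for one environment; keys come back as strings
config = YAML.safe_load(REDIS_YML)['production']

# Convert the string keys to symbols, like the rc hash in the rackup file
options = config.each_with_object({}) { |(key, value), h| h[key.to_sym] = value }

puts options[:host]
puts options[:port]
```

The Redis client expects symbol keys like :host and :port, which is why the conversion is needed before the hash is passed to Redis.new.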

Step 7: Process management for deploy servers

Here was one I didn’t expect – there’s very little process management in Resque! It includes a Rake task to start a single worker or multiple workers, but I didn’t find them useful outside the development environment. When deployed with Capistrano, these tasks didn’t survive the single-command non-login shell that Cap uses for everything. Even with nohup and &, the rake process was still killed as soon as Capistrano logged out.

I found this post from Thomas Mango on how to start and restart with God, but I’m not using God right now (that’s a project for another day). If you use Monit there’s also the example Monit script, but since our Monit scripts are not currently managed via chef or git, I wasn’t comfortable messing with those just yet.

I have some legacy workers from DelayedJob that use Daemon-Spawn, a pretty simple library for running things in the background. I also glanced at Daemon-Controller, but since I had examples in the codebase for Daemon-Spawn I decided it wasn’t worth switching right now.

Daemon-Spawn has PID-file and logging handled, and can also start multiple instances of the same process, so it does everything I needed it to right now. I needed two daemons to run – the Resque Scheduler to handle retrying failed jobs, and one to run a couple workers. Since I’m just implementing a proof-of-concept, I left each worker to handle all available queues, but you could conceivably create a daemon for each different queue you want to handle, and specify how many workers each queue should get.

Anyway, here’s what my scheduler daemon looked like.

#!/usr/bin/env ruby

ENV['RAILS_ENV'] ||= 'development'
require File.expand_path('../../config/boot',  __FILE__)
require Rails.root.join("config", "environment")

class ResqueSchedulerDaemon < DaemonSpawn::Base
  def start(args)
    Resque::Scheduler.verbose = true
    Resque::Scheduler.run
  end

  def stop
    Resque::Scheduler.shutdown
  end
end

ResqueSchedulerDaemon.spawn!({
  :log_file => File.join(RAILS_ROOT, "log", "resque_scheduler.log"),
  :pid_file => File.join(RAILS_ROOT, 'tmp', 'pids', 'resque_scheduler.pid'),
  :sync_log => true,
  :working_dir => RAILS_ROOT,
  :singleton => true
})

And here’s the worker daemon:

#!/usr/bin/env ruby

ENV['RAILS_ENV'] ||= 'development'
require File.expand_path('../../config/boot',  __FILE__)
require Rails.root.join("config", "environment")

class ResqueWorkerDaemon < DaemonSpawn::Base
  def start(args)
    @worker = Resque::Worker.new('*') # Specify which queues this worker will process
    @worker.verbose = 1 # Logging - can also set vverbose for 'very verbose'
    @worker.work
  end

  def stop
    @worker.try(:shutdown)
  end
end

ResqueWorkerDaemon.spawn!({
  :processes => 5,
  :log_file => File.join(RAILS_ROOT, "log", "resque_worker.log"),
  :pid_file => File.join(RAILS_ROOT, 'tmp', 'pids', 'resque_worker.pid'),
  :sync_log => true,
  :working_dir => RAILS_ROOT,
  :singleton => true
})

At first I tried doing a bunch of command-line fu using system, but that was wrong-headed. Looking into the code for the Rake tasks that resque-scheduler and resque provide, starting these background processes is blindingly simple.

Step 8: Add daemons and server to Capistrano

With those daemons crafted, it’s just a matter of adding them to deploy.rb (or another file that Capistrano loads, as per your preference). I added a Resque namespace in Capistrano and put each of the tasks in after ‘deploy:’ calls, so I had the following setup.

after "deploy:stop",        "resque:stop"
after "deploy:start",       "resque:start"
after "deploy:restart",     "resque:restart"

And then the associated tasks:

namespace :resque do
  def rails_env
    fetch(:rails_env, false) ? "RAILS_ENV=#{fetch(:rails_env)}" : ''
  end

  desc "Start resque scheduler, workers"
  task :start, :roles => :app, :only => { :jobs => true } do
    run "cd #{current_path};#{rails_env} daemons/resque_scheduler start"
    run "cd #{current_path};#{rails_env} daemons/resque_worker start"
    run "cd #{current_path};RESQUE_THIN_ENV=#{stage} bundle exec thin -d -P /tmp/thin.pid -p 9292 -R config/resque_config.ru start; true"
  end

  # test commit for nohup
  desc "Stop resque workers"
  task :stop, :roles => :app, :only => { :jobs => true } do
    run "cd #{current_path};#{rails_env} daemons/resque_scheduler stop"
    run "cd #{current_path};#{rails_env} daemons/resque_worker stop"
    run "cd #{current_path};RESQUE_THIN_ENV=#{stage} bundle exec thin -d -P /tmp/thin.pid -p 9292 -R config/resque_config.ru stop; true"
  end

  desc "Restart resque workers"
  task :restart, :roles => :app, :only => { :jobs => true } do
    run "cd #{current_path};#{rails_env} daemons/resque_scheduler restart"
    run "cd #{current_path};#{rails_env} daemons/resque_worker restart"
    [:stop, :start].each { |cmd| run "cd #{current_path};RESQUE_THIN_ENV=#{stage} bundle exec thin -d -P /tmp/thin.pid -p 9292 -R config/resque_config.ru #{cmd}; true" }
  end
end

Since thin manages its own daemon, you have to be a bit more explicit with what to do with it. Specifying the PID file to be outside the Rails code path is necessary so that you’re not trying to determine whether thin’s PID file is in the current or previous release. After all, there should be only one.

Also, I have a web server running on port 80 already, and since the Resque backend is only for developers to tinker with and has HTTP auth on it, I figured it’s fine to let it reside on another port. You can always proxy it with Nginx or a CNAME or something.

I had to call stop / start on thin, since the restart action was timing out waiting for the PID file to get deleted. I think this is an issue with Resque-web disconnecting from Resque after shutdown is complete, and only erasing the PID file then. Probably not a bug in thin itself.

Finally, if I don’t have a pretty backend for Resque, it’s not necessary to roll back the deployment – that can be noted and dealt with. So, instead of taking the output of thin’s process, I just return true in any case.

Step 9: Deploy!

You may have to call cap production resque:start before you can do your normal deploy cycle – our deploy path calls the restart actions by default, and these actions got quite irritated if there weren’t any processes running to begin with. Just calling resque:start from my local machine fixed that.

Good luck! If you have any questions, improvements, or trouble, please post in the comments.


Took a long time to get all that together. Discovering the myriad of ways NOT to do all this was quite an explorative process. And there are still lots of improvements to make – managing thin with Capistrano is hacky at best. Also, I really should get God or Monit in on the action, since I want to seriously ensure that workers are running all the time. But this does get our Resque proof out, and I can compare execution times to our existing synchronous methods.

Super-quick, clean implementation to determine if a hash[:key1]["key2"]["key3"] exists for use in conditionals.

[:key1,:key2].inject(hash){|h,k| h && h[k]}

Cred goes to SO user taw. Thanks!
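Here’s that one-liner wrapped in a method with a couple of calls, so you can see what it returns when a key is missing partway down. The nested_value name and the sample hash are mine.

```ruby
# Walk the keys one level at a time; stop yielding values once anything is nil
def nested_value(hash, keys)
  keys.inject(hash) { |h, k| h && h[k] }
end

data = { :key1 => { "key2" => { "key3" => 42 } } }

p nested_value(data, [:key1, "key2", "key3"])    # 42
p nested_value(data, [:key1, "missing", "key3"]) # nil
p nested_value(data, [:nope, "key2"])            # nil
```

On Ruby 2.3 and later, data.dig(:key1, "key2", "key3") does the same job out of the box.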

If you need to use any functions from your Rails app in your Rake tasks, you have to declare the task with this format:

task :name => :environment do
  ...
end

Forgot about that, or just didn’t notice it or something and ended up asking this StackOverflow question. Gosh don’t I feel silly.

If you fail to do that, you get crazy “uninitialized constant” errors like this:

rake aborted!
uninitialized constant Object::HelloClass
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2503:in `const_missing'
/Users/name/Sites/Rails/rake_test/lib/tasks/testclass.rake:5:in `block (2 levels) in '
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:636:in `call'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:636:in `block in execute'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:631:in `each'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:631:in `execute'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:597:in `block in invoke_with_call_chain'
/Users/name/.rvm/rubies/ruby-1.9.2-p0/lib/ruby/1.9.1/monitor.rb:201:in `mon_synchronize'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:590:in `invoke_with_call_chain'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:583:in `invoke'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2051:in `invoke_task'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2029:in `block (2 levels) in top_level'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2029:in `each'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2029:in `block in top_level'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2023:in `top_level'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2001:in `block in run'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:2068:in `standard_exception_handling'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/lib/rake.rb:1998:in `run'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/gems/rake-0.8.7/bin/rake:31:in `'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/bin/rake:19:in `load'
/Users/name/.rvm/gems/ruby-1.9.2-p0@global/bin/rake:19:in `'
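To see why the => :environment part matters, here’s a tiny stand-alone script that uses Rake as a library. The :environment task below is a fake stand-in I wrote for the one Rails defines (the real one loads your app), and :hello is a made-up task; the point is only that prerequisites run first.

```ruby
require 'rake'

extend Rake::DSL # make sure `task` is available in a plain script

RAN = []

# Fake stand-in for the :environment task that Rails provides
task :environment do
  RAN << :environment
end

# Declaring :environment as a prerequisite makes it run before the body
task :hello => :environment do
  RAN << :hello
end

Rake::Task[:hello].invoke

p RAN # [:environment, :hello]
```

Leave off the prerequisite and the body runs with nothing loaded, which is where the uninitialized constant error comes from.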