In Rails, it’s easy to get a bunch of records from your database if you have their IDs:

Person.where(id: [1, 2, 3]).map(&:id) # => [1, 2, 3]

But what if you wanted to get the records back in a different order? Maybe your search engine returns the most relevant IDs first. How do you keep your records in that order?

You could try where again:

Person.where(id: [2, 1, 3]).map(&:id) # => [1, 2, 3]

But that doesn’t work at all. So how do you get your records back in the right order?

The compatible way: case statements

Just like Ruby, SQL supports case...when statements.

You don’t see it too often, but case...when statements can almost act like hashes. You can map one value to another:

case :b
when :a then 1
when :b then 2
when :c then 3
end # => 2

That case statement kind of looks like a hash:

{
  :a => 1,
  :b => 2,
  :c => 3
}[:b] # => 2

So, you have a way to map keys to the order they should appear in. And your database can sort and return your results by that arbitrary order.

Knowing that, you could put your IDs and their position into a case statement, and use it in a SQL order clause.

So if you wanted your objects returned in the order [2, 1, 3], your SQL could look like this:

SELECT * FROM people
  WHERE id IN (1, 2, 3)
  ORDER BY CASE id
    WHEN 2 THEN 0
    WHEN 1 THEN 1
    WHEN 3 THEN 2
    ELSE 3 END;

That way, your records are returned in the right order. The CASE transforms each ID into the order it should be returned in.
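If it helps to see what that CASE expression computes, here’s a minimal plain-Ruby sketch of the same mapping. The Row struct and the sort_position helper are made up for illustration, and the sorting happens in Ruby only to show the idea. The point of the SQL version is that the database does this work for you.

```ruby
# A stand-in for the rows the database would return (hypothetical).
Row = Struct.new(:id)

# Mirrors the CASE expression: each wanted ID maps to its position in the
# list, and any other ID falls through to the ELSE value (ids.length).
def sort_position(id, ids)
  ids.index(id) || ids.length
end

rows = [Row.new(1), Row.new(2), Row.new(3)] # the order the database happened to use
wanted = [2, 1, 3]

ordered = rows.sort_by { |row| sort_position(row.id, wanted) }
ordered.map(&:id) # => [2, 1, 3]
```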

Of course, that looks ridiculous. And you could imagine how annoying a clause like that would be to build by hand.

But you don’t have to build it by hand. That’s what Ruby’s for:

lib/extensions/active_record/find_by_ordered_ids.rb
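module Extensions::ActiveRecord::FindByOrderedIds
  extend ActiveSupport::Concern
  module ClassMethods
    def find_ordered(ids)
      # NOTE: ids are interpolated directly into the SQL, so only pass trusted, numeric IDs.
      order_clause = "CASE id "
      ids.each_with_index do |id, index|
        order_clause << "WHEN #{id} THEN #{index} "
      end
      order_clause << "ELSE #{ids.length} END"
      # Rails 6+ rejects raw SQL strings passed to order unless they're wrapped in Arel.sql.
      where(id: ids).order(Arel.sql(order_clause))
    end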
  end
end

ActiveRecord::Base.include(Extensions::ActiveRecord::FindByOrderedIds)

Person.find_ordered([2, 1, 3]).map(&:id) # => [2, 1, 3]

Exactly how we wanted it!

A cleaner, MySQL-specific way

If you use MySQL, there’s a cleaner way to do this. MySQL has a FIELD() function that you can use in an ORDER BY clause:

SELECT * FROM people
WHERE id IN (1, 2, 3)
ORDER BY FIELD(id, 2, 1, 3);

You could also generate that from Ruby:

lib/extensions/active_record/find_by_ordered_ids.rb
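module Extensions::ActiveRecord::FindByOrderedIds
  extend ActiveSupport::Concern
  module ClassMethods
    def find_ordered(ids)
      # NOTE: ids are interpolated directly into the SQL, so only pass trusted, numeric IDs.
      # Rails 6+ rejects raw SQL strings passed to order unless they're wrapped in Arel.sql.
      where(id: ids).order(Arel.sql("FIELD(id, #{ids.join(",")})"))
    end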
  end
end

ActiveRecord::Base.include(Extensions::ActiveRecord::FindByOrderedIds)

So, if you’re using MySQL, and not too worried about compatibility, this is a good way to go. It’s a lot easier to read as those statements fly through your logs.
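Since the IDs end up inside a SQL string, it can be worth building the FIELD clause in a small helper that refuses anything that isn’t an integer. Here’s a rough sketch of one. The field_order_clause name is my own, it isn’t part of Rails, and it assumes the column name comes from your own code rather than from a user.

```ruby
# Builds the ORDER BY fragment for MySQL's FIELD() function.
# Integer() raises ArgumentError for strings that don't parse as integers,
# so a value like "1); DROP TABLE people" never reaches the SQL string.
# The column name is assumed to be trusted (hypothetical helper).
def field_order_clause(column, ids)
  safe_ids = ids.map { |id| Integer(id) }
  # FIELD() needs at least one value to compare against.
  raise ArgumentError, "ids must not be empty" if safe_ids.empty?

  "FIELD(#{column}, #{safe_ids.join(', ')})"
end

field_order_clause("id", [2, 1, 3]) # => "FIELD(id, 2, 1, 3)"
```

Keep in mind that FIELD returns 0 for values that aren’t in the list, so rows with other IDs would sort first. That doesn’t matter here, because the where clause already limits the results to the IDs you asked for.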


When you want to display records in a specific, arbitrary order, you don’t need to sort them in Ruby. With a little code snippet, you can let the database do what it’s good at: finding, sorting, and returning data to your app.

You’re about to check in your next small feature, so you kick off a full integration test run. You wait, and wait, as the dots fill your screen, until…

......FF....

:-(

You still have a few minutes before your tests finish running. But if you quit the test run early, you’ll have no idea which tests failed.

Do you really have to wait for the entire run to finish before you can see those failures?

Ctrl-T to the rescue!

If you’re using a Mac, there’s a way to see your test failures early:

Hit Ctrl-T while your tests are running.

When you do, you’ll see which test case is currently running, and how long it’s been running for. If any tests have failed so far, you’ll also see those failures, so you can get a head start on fixing them before your next run!

This is also really handy for debugging tests that just hang. Ctrl-T will tell you which test is trying to run, so you can isolate just that one test and fix it.

Finally, I’ve built a habit of hitting Ctrl-T anytime a test takes a noticeably long time (say, a second or longer) to finish. It’s pointed me to plenty of slow tests that I need to make faster.

How does Ctrl-T work?

On a Mac, Ctrl-T sends a message, or signal, called INFO, to whichever program is running:

signal_test.rb
puts "Starting..."
trap("INFO") { puts "INFO triggered!" }

loop { print "."; sleep 0.1 }

~/Source jweiss$ ruby signal_test.rb
Starting...
........^Tload: 7.14  cmd: ruby 6121 running 0.10u 0.08s
INFO triggered!
.......^Tload: 7.14  cmd: ruby 6121 running 0.10u 0.08s
INFO triggered!
................^Tload: 11.77  cmd: ruby 6121 running 0.10u 0.08s
.INFO triggered!
......^Csignal_test.rb:5:in `sleep': Interrupt
        from signal_test.rb:5:in `block in <main>'
        from signal_test.rb:5:in `loop'
        from signal_test.rb:5:in `<main>'

Minitest knows about INFO, and responds to it by printing information about the test run:

~/Source/rails/activesupport[master] jweiss$ be rake
/usr/local/Cellar/ruby/2.2.0/bin/ruby -w -I"lib:test"  "/usr/local/Cellar/ruby/2.2.0/lib/ruby/2.2.0/rake/rake_test_loader.rb" "test/**/*_test.rb"
Run options: --seed 33445

# Running:

.................F........^Tload: 1.62  cmd: ruby 29646 running 4.37u 1.40s
Current results:


  1) Failure:
CleanLoggerTest#test_format_message [/Users/jweiss/Source/rails/activesupport/test/clean_logger_test.rb:13]:
Expected "error\n" to not be equal to "error\n".



Current: DigestUUIDExt#test_invalid_hash_class 0.02s
............................

Pretty nice!

Knowing that this is possible, you might think of ways other apps could handle INFO:

  • Rails could display the currently running controller action or some performance stats.
  • Sidekiq could tell you what each worker is doing, so you could see where they get stuck.

And Sidekiq actually used to use INFO to print a backtrace of each thread it ran. But because INFO isn’t supported on Linux, Sidekiq switched to a different signal. Unfortunately, that signal can’t be triggered by a keyboard shortcut the way INFO can.

Because INFO isn’t available on Linux (and some might say that using INFO this way isn’t totally right, anyway), this behavior isn’t as widespread as it could be.
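If you wanted to try this in your own program, one option is to check which signals the platform knows about and fall back to another one when INFO is missing. The sketch below is just that, a sketch: the ProgressReporter class is hypothetical, and choosing USR1 as the fallback is my assumption, not what Minitest or Sidekiq do.

```ruby
# Use INFO where the platform has it (macOS, BSD); otherwise fall back to
# USR1, which you'd trigger with `kill -USR1 <pid>` instead of Ctrl-T.
STATUS_SIGNAL = Signal.list.key?("INFO") ? "INFO" : "USR1"

# Hypothetical helper that remembers what the program is doing right now
# and records a status line whenever the status signal arrives.
class ProgressReporter
  attr_reader :reports

  def initialize
    @current_item = nil
    @reports = []
  end

  # Installs the signal handler. A real program would probably print the
  # status line; collecting it in an array keeps the sketch easy to check.
  def install
    Signal.trap(STATUS_SIGNAL) { @reports << status_line }
  end

  def status_line
    "Currently working on: #{@current_item.inspect}"
  end

  # Marks an item as in progress while the block runs.
  def work_on(item)
    @current_item = item
    yield
  end
end
```

You’d call install once at startup, wrap each unit of work in work_on, and then send the signal whenever you’re curious what the process is up to.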

Still, it’s a little bit of extra help that could be useful in a wide range of situations. If you’re building an app, it’s worth thinking about what kind of status messages you could display on-demand to people who are interested.

You know that performance is a feature. And a lot of performance problems can be found and fixed during development.

But what about those slowdowns that only show up in production? Do you have to add log messages to every single line of code? That would just slow things down even more! Or do you ship tons of tiny “maybe this fixes it” commits to see what sticks?

You don’t have to ruin your code to analyze it. Instead, try rbtrace-ing it.

Trace your running Ruby app

With rbtrace, you can detect performance problems, run code inside another Ruby process, and log method calls without having to add any code. Just add gem "rbtrace" to your Gemfile.

I learned about rbtrace from Sam Saffron’s amazing post about debugging memory leaks in Ruby (which you should really check out, if you haven’t already).

In that post, Sam used rbtrace to see all of the objects a process used:

bundle exec rbtrace -p $SIDEKIQ_PID -e 'Thread.new{GC.start;require "objspace";io=File.open("/tmp/ruby-heap.dump", "w"); ObjectSpace.dump_all(output: io); io.close}'

This is awesome. But there’s a whole lot more you can do.

What can you do with rbtrace?

Ever wanted to see the SQL statements you’re running in production (and how long they took)?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --methods "ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#execute_and_clear(sql)"
*** attached to process 7897
ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#execute_and_clear(sql="SELECT  \"articles\".* FROM \"articles\" WHERE \"articles\".\"id\" = $1 LIMIT 1") <0.002631>

All method calls that take longer than 2 seconds?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --slow 2000
*** attached to process 8154
    Integer#times <2.463761>
        ArticlesController#create <2.558673>
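To get a feel for what “report the slow calls” means, here’s a rough in-process approximation built on Ruby’s own TracePoint. To be clear, this is not how rbtrace works (rbtrace attaches to a separate, already-running process), and the slow_method_calls helper and Report class are made up for the example.

```ruby
# Runs the block and returns [method_name, elapsed_ms] pairs for every
# Ruby-defined method call that took at least threshold_ms.
# Hypothetical helper; only sees calls made while the block is running.
def slow_method_calls(threshold_ms)
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  started = [] # stack of start times, one per method call in progress
  slow = []

  trace = TracePoint.new(:call, :return) do |tp|
    if tp.event == :call
      started.push(clock.call)
    elsif (began = started.pop)
      elapsed_ms = (clock.call - began) * 1000
      if elapsed_ms >= threshold_ms
        slow << ["#{tp.defined_class}##{tp.method_id}", elapsed_ms.round(1)]
      end
    end
  end

  trace.enable { yield }
  slow
end

class Report
  def title
    "Weekly report"
  end

  def build
    sleep 0.05 # stand-in for slow work
    title
  end
end

slow_method_calls(20) { Report.new.build }.map(&:first) # => ["Report#build"]
```

Unlike rbtrace, this means changing and redeploying your code, which is exactly what you’re trying to avoid in production. It’s only here to show the idea.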

Do you want to know every time a certain method gets called?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID --methods "ActiveRecord::Persistence#save"
*** attached to process 8154
ActiveRecord::Persistence#save <0.010964>

See which threads your app is running?

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID -e "Thread.list"
*** attached to process 8154
>> Thread.list
=> [#<Thread:0x007ff4fcc9a8a8@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/server.rb:269 sleep>, #<Thread:0x007ff4fcc9aa10@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/thread_pool.rb:148 sleep>, #<Thread:0x007ff4fcc9ab50@/usr/local/lib/ruby/gems/2.2.0/gems/puma-2.6.0/lib/puma/reactor.rb:104 sleep>, #<Thread:0x007ff4f98c0410 sleep>]

Yep, with -e you can run Ruby code inside your server:

~/Source/testapps/rbtrace jweiss$ rbtrace -p $RAILS_PID -e "ActiveRecord::Base.connection_config"
*** attached to process 8154
>> ActiveRecord::Base.connection_config
=> {:adapter=>"postgresql", :pool=>5, :timeout=>5000, :database=>"rbtrace_test"}

Yeah, OK, now I’m a little scared. But that’s still very cool. (And only users with permission to mess with the process can rbtrace it, so it’s probably OK.)


rbtrace gives you a ton of tools to inspect your Ruby processes in staging and production. You can see how your processes are using (or abusing) memory, trace slow function calls, and even execute Ruby code.

You don’t have to create tons of test commits and log messages to fix problems. You can just hop on to the server, get some data, and hop back out. And even if I’m not totally comfortable using it in production yet, I’m sure it’ll even help out in our test environments.

How about you? What could you use rbtrace for?

You’ve probably seen this pattern before. A method has an options hash as its last argument, which holds extra parameters:

def hello_message(name_parts = {})
  first_name = name_parts.fetch(:first_name)
  last_name = name_parts.fetch(:last_name)

  "Hello, #{first_name} #{last_name}"
end

Unfortunately, you need to extract those parameters out of the hash. And that means there’s a lot of setup to wade through before you get to the good part.

But if you changed this method to use keyword arguments (required keyword arguments like these need Ruby 2.1+), you wouldn’t have to pull :first_name and :last_name out of your hash. Ruby does it for you:

def hello_message(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end
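One difference you’ll notice is what happens when a caller forgets an argument. With the options hash, the error comes from fetch inside your method. With keyword arguments, Ruby rejects the call itself. Here’s a small sketch of both. The method names and the error_class helper are just for this example.

```ruby
def hello_with_options_hash(name_parts = {})
  first_name = name_parts.fetch(:first_name)
  last_name = name_parts.fetch(:last_name)
  "Hello, #{first_name} #{last_name}"
end

def hello_with_keywords(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end

# Returns the class of the error the block raises, or nil if it raises nothing.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

# Both calls leave out :last_name.
error_class { hello_with_options_hash(first_name: "Justin") } # => KeyError
error_class { hello_with_keywords(first_name: "Justin") }     # => ArgumentError
```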

Even better, if your app uses Ruby 1.9+ hash syntax, your methods can use keyword arguments without changing those methods’ callers:

def hello_message_with_an_options_hash(name_parts = {})
  first_name = name_parts.fetch(:first_name)
  last_name = name_parts.fetch(:last_name)

  "Hello, #{first_name} #{last_name}"
end

def hello_message_with_keyword_arguments(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end

hello_message_with_an_options_hash(first_name: "Justin", last_name: "Weiss")

hello_message_with_keyword_arguments(first_name: "Justin", last_name: "Weiss")

See? Those arguments are identical!

Pushing keyword argument syntax one step too far

What if you haven’t switched to the new Hash syntax, though? You could convert all your code. But, at least in Ruby 2.2.1, the old hash syntax works just fine with keyword arguments:

irb(main):007:0> hello_message_with_keyword_arguments(:first_name => "Justin", :last_name => "Weiss")
=> "Hello, Justin Weiss"

Nice! What about passing a hash object, instead of arguments?

irb(main):008:0> options = {:first_name => "Justin", :last_name => "Weiss"}
irb(main):009:0> hello_message_with_keyword_arguments(options)
=> "Hello, Justin Weiss"

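Whoa. (That implicit conversion is a Ruby 2.x behavior: Ruby 2.7 prints a deprecation warning for it, and Ruby 3.0 and later raise an ArgumentError.) What if we want to mix a hash and keyword arguments?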

irb(main):010:0> options = {last_name: "Weiss"}
irb(main):011:0> hello_message_with_keyword_arguments(first_name: "Justin", options)
SyntaxError: (irb):11: syntax error, unexpected ')', expecting =>
 from /usr/local/bin/irb:11:in `<main>'

OK. I guess we took that one step too far. To fix this, you could use Hash#merge to build a hash you could pass in on its own. But there’s a better way.

If you were using regular arguments instead of keyword arguments, you could splat arguments from an Array, using *:

def generate_thumbnail(name, width, height)
  # ...
end

dimensions = [240, 320]
generate_thumbnail("headshot.jpg", *dimensions)

But is there a way to splat keyword arguments into an argument list?

It turns out there is: **. Here’s how you’d fix that broken example with **:

irb(main):010:0> options = {last_name: "Weiss"}
irb(main):011:0> hello_message_with_keyword_arguments(first_name: "Justin", **options)
=> "Hello, Justin Weiss"
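For comparison, here’s roughly what the Hash#merge approach mentioned earlier could look like. It’s a sketch, and it still uses ** on the final call, on the assumption that you want it to keep working on Ruby 3.0+, where a plain hash is no longer turned into keyword arguments for you.

```ruby
def hello_message_with_keyword_arguments(first_name:, last_name:)
  "Hello, #{first_name} #{last_name}"
end

options = { last_name: "Weiss" }

# Build one complete hash first, then pass it on its own.
all_options = { first_name: "Justin" }.merge(options)

hello_message_with_keyword_arguments(**all_options) # => "Hello, Justin Weiss"
```

It works, but you need an extra variable and an extra step, which is why splatting straight into the argument list reads better.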

And if you’re really crazy, you can mix regular arguments, keyword arguments, and splats:

def hello_message(greeting, time_of_day, first_name:, last_name:)
  "#{greeting} #{time_of_day}, #{first_name} #{last_name}!"
end

args = ["Morning"]
keyword_args = {last_name: "Weiss"}

hello_message("Good", *args, first_name: "Justin", **keyword_args) # => "Good Morning, Justin Weiss!"

Of course, if you find yourself in the situation where that’s necessary, you probably made a mistake a lot earlier!

Capture keyword arguments the easy way

Do you know how you can turn all your method arguments into an array using *?

def argument_capturing_method(*args)
  args
end

argument_capturing_method(1, 2, 3) # => [1, 2, 3]

This also works with keyword arguments. They’re converted to a hash, and show up as the last argument of your args array:

argument_capturing_method(1, 2, 3, key: "value") # => [1, 2, 3, {:key=>"value"}]

But args.last[:key] isn’t the best way to read keyword arguments grabbed this way. Instead, you can use the new ** syntax to get the keyword arguments by themselves:

def dual_argument_capturing_method(*args, **keyword_args)
  {args: args, keyword_args: keyword_args}
end

dual_argument_capturing_method(1, 2, 3, key: "value") # => {:args=>[1, 2, 3], :keyword_args=>{:key=>"value"}}

With this syntax, you can access the first regular argument with args[0] and the :key keyword argument with keyword_args[:key].
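Where this tends to come up is in generic code that has to pass arguments along to some other method without knowing what they are. Here’s a minimal sketch of that idea. Greeter and LoggingProxy are hypothetical classes, not something from Rails or the standard library.

```ruby
class Greeter
  def hello_message(greeting, first_name:, last_name:)
    "#{greeting}, #{first_name} #{last_name}!"
  end
end

# Records every call it receives, then forwards the regular arguments and
# the keyword arguments, unchanged, to the wrapped object.
class LoggingProxy
  attr_reader :calls

  def initialize(target)
    @target = target
    @calls = []
  end

  def call(method_name, *args, **keyword_args)
    @calls << [method_name, args, keyword_args]
    @target.public_send(method_name, *args, **keyword_args)
  end
end

proxy = LoggingProxy.new(Greeter.new)
proxy.call(:hello_message, "Hello", first_name: "Justin", last_name: "Weiss")
# => "Hello, Justin Weiss!"
```

Capturing with *args and **keyword_args and then splatting both back out keeps the two kinds of arguments separate the whole way through.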

… Of course, now we’re back to options hashes.


Keyword arguments are great for removing a ton of parameter extraction boilerplate from your code. And you might not even have to change any of your code to take advantage of them.

But when you write more generic methods, there are some new techniques you’ll have to learn to handle keyword arguments well. You might not have to use those techniques very often. But when you do, this power and flexibility will be there, waiting for you.

When you go bugfixing, the quick, obvious change isn’t always the best one. And the code in front of you is never the whole story. To go beyond the easy fix, you have to know why certain decisions were made. You have to understand the history behind the code. And there are three great ways to learn what you need to know to confidently change code.

git blame

With the help of git blame, you can trace through every version of every line of code in a project, all the way back to when it was written.

For example, say you were looking at ActiveJob’s queue_name.rb file, and you wanted to know what this queue_name_delimiter attribute was all about:

activejob/lib/active_job/queue_name.rb
included do
  class_attribute :queue_name, instance_accessor: false
  class_attribute :queue_name_delimiter, instance_accessor: false

  self.queue_name = default_queue_name
  self.queue_name_delimiter = '_' # set default delimiter to '_'
end

You could run git blame on it:

$ git blame queue_name.rb

...
da6a86f8 lib/active_job/queue_name.rb           (Douwe Maan               2014-06-09 18:49:14 +0200 34)     included do
1e237b4e activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-25 17:34:50 +0300 35)       class_attribute :queue_name, instance_accessor: false
11ab04b1 activejob/lib/active_job/queue_name.rb (Terry Meacham            2014-09-23 15:51:44 -0500 36)       class_attribute :queue_name_delimiter, instance_accessor: false
11ab04b1 activejob/lib/active_job/queue_name.rb (Terry Meacham            2014-09-23 15:51:44 -0500 37)
...

And for each line, in order, you’ll see:

  • The revision that changed that line most recently (11ab04b1, for example),
  • The name of the author of that commit,
  • And the date the change was made.

To learn more about that line of code, you’ll need the revision ID. Pass it (that 11ab04b1 part) to git show or git log:

$ git show 11ab04b1

commit 11ab04b11170253e96515c3ada6f2566b092533a
Author: Terry Meacham <zv1n.fire@gmail.com>
Date:   Tue Sep 23 15:51:44 2014 -0500

    Added queue_name_delimiter attribute.

    - Added ActiveJob::Base#queue_name_delimiter to allow for
      developers using ActiveJob to change the delimiter from the default
      ('_') to whatever else they may be using (e.g., '.', '-', ...).

    - Updated source guide to include a blurb about the delimiter.

diff --git a/activejob/lib/active_job/queue_name.rb b/activejob/lib/active_job/queue_name.rb
index d167617..6ee7142 100644
...

Cool! You get to learn a little more about the change, why it’s useful, and see the part of the Rails Guide about it that you might have missed before.

Here, we got pretty lucky. We found the information we were looking for right away. But git blame only shows you the most recent time that line changed. And sometimes, you won’t find what you’re looking for until you go two or three commits back.

To see earlier commits, you can call git blame again. But this time, pass the revision before the commit git blame found. (In git, you can say “the commit before this other commit” by putting a ^ after the revision, like 11ab04b1^):

$ git blame 11ab04b1^ queue_name.rb

...
da6a86f8 lib/active_job/queue_name.rb           (Douwe Maan               2014-06-09 18:49:14 +0200 33)     included do
1e237b4e activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-25 17:34:50 +0300 34)       class_attribute :queue_name, instance_accessor: false
94ae25ec activejob/lib/active_job/queue_name.rb (Cristian Bica            2014-08-15 23:32:08 +0300 35)       self.queue_name = default_queue_name
...

$ git blame 1e237b4e^ queue_name.rb
... and so on ...

That’s pretty mind-numbing, though.

Instead, explore your text editor. Most editors make tracing through history with git blame easy. For example, in Emacs, after git blame-ing your code, place your cursor on a line. Then, you can press a to step back through each change that affected that line, and use l and D to see more detail about a commit.

Does your team refer to issue numbers and pull requests in your commit messages? If so, git blame can easily take you from a line of code, to a commit, to the discussion about that commit. And that discussion is where you’ll find all the really good stuff.

GitHub Issues

A little while ago, I noticed that Rails 4.2 removed respond_with. In the docs, it’s clear that it was removed, but I didn’t understand why.

There’s a ton of great knowledge hidden behind GitHub’s issue search box. If you want to know why a feature was removed, there’s no better place to learn than the discussion the team had while they decided to remove it.

So, if you search Rails’ GitHub repo for respond_with, you’ll find some interesting threads about respond_with. If you’re trying to find out why it was removed, you’ll probably land on this thread. Unfortunately, it describes how it was removed, but not why.

Later on in that thread, though, you’ll find a comment that points to the actual discussion about removing respond_with. That’s where the good stuff is!

As with git blame, you might not find exactly what you’re looking for right away. You’ll have to follow references, read comments, and click on links. But GitHub’s issue search will get you started in the right place. And with a little curiosity and a sense of exploration, you’ll learn what you came for.

Ask questions

Unfortunately, not all knowledge about a project can be found in its history, issues, and pull requests. Not everything is written down.

So if you can find the person that originally wrote the code, ask about it. You’ll discover the dev culture and ideas that led to a decision. You’ll learn some history about the project that might never have been recorded. And you’ll hear about paths that were never taken, which might actually make sense to try now.

The easy way can be dangerous

Sometimes, a fix just seems so easy. All you have to do is rescue this one exception in this one place, and then you can go home!

But it’s dangerous to change code you don’t understand.

So when a line of code seems off, or a decision seems weird, or a call seems useless, put on your archeologist’s hat. Learn as much as you can, however you can, about that code. That’s how you’ll make your code change intentionally, and fix the problem at its root.

What if your Rails app couldn’t tell who was visiting it? If you had no idea that the same person requested two different pages? If all the data you stored vanished as soon as you returned a response?

That might be fine for a mostly static site. But most apps need to be able to store some data about a user. Maybe it’s a user id, or a preferred language, or whether they always want to see the desktop version of your site on their iPad.

session is the perfect place to put this kind of data. Little bits of data you want to keep around for more than one request.

Sessions are easy to use:

session[:current_user_id] = @user.id

But they can be a little magical. What is a session? How does Rails know to show the right data to the right person? And how do you decide where you keep your session data?

What is a session?

A session is just a place to store data during one request that you can read during later requests.

You can set some data in a controller action:

app/controllers/sessions_controller.rb
def create
  # ...
  session[:current_user_id] = @user.id
  # ...
end

And read it in another:

app/controllers/users_controller.rb
def index
  current_user = User.find_by_id(session[:current_user_id])
  # ...
end

It might not seem that interesting. But it takes coordination between your user’s browser and your Rails app to make everything connect up. And it all starts with cookies.

When you request a webpage, the server can set a cookie when it responds back:

~ jweiss$ curl -I http://www.google.com | grep Set-Cookie

Set-Cookie: NID=67=J2xeyegolV0SSneukSOANOCoeuDQs7G1FDAK2j-nVyaoejz-4K6aouUQtyp5B_rK3Z7G-EwTIzDm7XQ3_ZUVNnFmlGfIHMAnZQNd4kM89VLzCsM0fZnr_N8-idASAfBEdS; expires=Wed, 16-Sep-2015 05:44:42 GMT; path=/; domain=.google.com; HttpOnly

Your browser will store those cookies. And until the cookie expires, every time you make a request, your browser will send the cookies back to the server:

...
> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: www.google.com
> Accept: */*
> Cookie: NID=67=J2xeyegolV0SSneukSOANOCoeuDQs7G1FDAK2j-nVyaoejz-4K6aouUQtyp5B_rK3Z7G-EwTIzDm7XQ3_ZUVNnFmlGfIHMAnZQNd4kM89VLzCsM0fZnr_N8-idASAfBEdS; expires=Wed, 16-Sep-2015 05:44:42 GMT; path=/; domain=.google.com; HttpOnly
...

Many cookies just look like gibberish. And they’re supposed to. Because the information inside the cookie isn’t meant for the user. Your Rails app is in charge of figuring out what a cookie means. Your app set it, so your app can read it.

What does this have to do with a session?

So, you have a cookie. You put data in during one request, and you get that same data in the next. What’s the difference between that and a session?

By default, in Rails, there isn’t much of a difference. Rails does some work with the cookie to make it more secure. But besides that, it works the way you’d expect. Your Rails app puts some data into the cookie, the same data comes out of the cookie. If this was all there was, there’d be no reason to distinguish sessions from cookies.
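To make “more secure” a little more concrete: one piece of that work is signing the cookie, so your app can tell when someone has changed it. Here’s a toy sketch of the idea using only Ruby’s standard library. It is not Rails’ implementation (Rails does more, and recent versions also encrypt the payload), and the ToySignedCookie module and the secret below are made up for illustration.

```ruby
require "openssl"
require "json"

# Toy illustration of a signed cookie value: "<base64 data>--<HMAC of that data>".
module ToySignedCookie
  module_function

  def generate(data, secret)
    payload = [JSON.generate(data)].pack("m0") # Base64 without newlines
    "#{payload}--#{signature(payload, secret)}"
  end

  # Returns the stored data, or nil if the cookie is malformed or was tampered with.
  def read(cookie, secret)
    payload, digest = cookie.to_s.split("--", 2)
    return nil if payload.nil? || digest.nil?
    return nil unless same_bytes?(digest, signature(payload, secret))

    JSON.parse(payload.unpack1("m0"))
  end

  def signature(payload, secret)
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, payload)
  end

  # Compares every byte instead of stopping at the first difference.
  def same_bytes?(a, b)
    return false unless a.bytesize == b.bytesize

    a.bytes.zip(b.bytes).reduce(0) { |diff, (x, y)| diff | (x ^ y) }.zero?
  end
end

secret = "not-a-real-secret-key-base" # hypothetical
cookie = ToySignedCookie.generate({ current_user_id: 1 }, secret)
ToySignedCookie.read(cookie, secret) # => {"current_user_id"=>1}
```

Anyone can still read the data in a cookie like this one. But without the secret, they can’t produce a matching signature for data they’ve changed, which is also why leaking that secret is such a problem.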

But cookies aren’t always the right answer for session data:

  • You can only store about 4 KB of data in a cookie.

    This is usually enough, but sometimes it’s not.

  • Cookies are sent along with every request you make.

    Big cookies mean bigger requests and responses, which mean slower websites.

  • If you accidentally expose your secret_key_base, your users can change the data you’ve put inside your cookie.

    When this includes things like current_user_id, anyone can become whichever user they want!

  • Storing the wrong kind of data inside a cookie can be insecure.

If you’re careful, these aren’t big problems.

But when you can’t store your session data inside a cookie for one of these reasons, Rails has a few other places to keep your sessions:

Alternative session stores

All of the session stores that aren’t the cookie session store work in pretty much the same way. But it’s easiest to think about using a real example.

If you were keeping track of your sessions with ActiveRecord:

  1. When you call session[:current_user_id] = 1 in your app, and a session doesn’t already exist:

  2. Rails will create a new record in your sessions table with a random session ID (say, 09497d46978bf6f32265fefb5cc52264).

  3. It’ll store {current_user_id: 1} (Base64-encoded) in the data attribute of that record.

  4. And it’ll return the generated session ID, 09497d46978bf6f32265fefb5cc52264, to the browser using Set-Cookie.

The next time you request a page,

  1. The browser sends that same cookie to your app, using the Cookie: header.

    (like this: Cookie: _my_app_session=09497d46978bf6f32265fefb5cc52264)

  2. When you call session[:current_user_id]:

  3. Your app grabs the session ID out of your cookie, and finds its record in the sessions table.

  4. Then, it returns current_user_id out of the data attribute of that record.

Whether you’re storing sessions in the database, in Memcached, in Redis, or wherever else, they mostly follow this same process. Your cookie only contains a session ID, and your Rails app looks up the data in your session store using that ID.
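The steps above can be sketched in a few lines of plain Ruby. This toy store keeps everything in a Hash instead of a database table, and the ToySessionStore class is hypothetical. It’s only meant to show that the session ID is the one thing that travels in the cookie.

```ruby
require "securerandom"

# A toy server-side session store. Real stores keep the records in a
# database, Memcached, Redis, and so on, instead of an in-memory Hash.
class ToySessionStore
  def initialize
    @records = {}
  end

  # Saves data under session_id, generating a random ID if there isn't one yet.
  # Returns the ID, which is what would be sent to the browser via Set-Cookie.
  def write(session_id, data)
    session_id ||= SecureRandom.hex(16)
    @records[session_id] = read(session_id).merge(data)
    session_id
  end

  # Looks up the data for the ID the browser sent back in its Cookie header.
  def read(session_id)
    @records.fetch(session_id, {})
  end
end

store = ToySessionStore.new
session_id = store.write(nil, current_user_id: 1) # first request: no session yet
store.read(session_id)                            # later request: => {:current_user_id=>1}
```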

Cookie store, cache store, or database store?

When it works, storing your sessions in cookies is by far the easiest way to go. It doesn’t need any extra infrastructure or setup.

But if you need to move beyond the cookie session store, you have two options:

Store sessions in a database, or store them in your cache.

Storing sessions in the cache

You might already be using something like Memcached to cache your partials or data. If so, the cache store is the second-easiest place to store session data, since it’s already set up.

You don’t have to worry about your session store growing out of control, because older sessions will automatically get kicked out of the cache if it gets too big. And it’s fast, because your cache will most likely be kept in memory.

But it’s not perfect:

  • If you actually care about keeping old sessions around, you probably don’t want them to get kicked out of the cache.

  • Your sessions and your cached data will be fighting for space. If you don’t have enough memory, you could be facing a ton of cache misses and early expired sessions.

  • If you ever need to reset your cache (say you upgraded Rails and your old cached data is no longer accurate), there’s no way to do that without expiring everyone’s sessions.
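Here’s a tiny sketch of that fight for space. It uses a toy least-recently-used cache, which is similar in spirit to how caches like Memcached decide what to throw away, though the TinyLruCache class itself is made up and far simpler than the real thing.

```ruby
# A toy cache that holds at most max_entries items and evicts the least
# recently used one when it runs out of room.
class TinyLruCache
  def initialize(max_entries)
    @max_entries = max_entries
    @data = {} # Ruby hashes remember insertion order, oldest first
  end

  def write(key, value)
    @data.delete(key) # re-inserting moves the key to the "newest" end
    @data[key] = value
    @data.delete(@data.keys.first) while @data.size > @max_entries
    value
  end

  def read(key)
    return nil unless @data.key?(key)

    value = @data.delete(key)
    @data[key] = value # reading also counts as a recent use
  end
end

cache = TinyLruCache.new(2)
cache.write("session:09497d46", { current_user_id: 1 })
cache.write("views/articles/1", "<li>...</li>")
cache.write("views/articles/2", "<li>...</li>") # no room left, so the oldest entry goes
cache.read("session:09497d46") # => nil
```

The session didn’t expire on its own. It was pushed out by cached view fragments, and that user is now logged out.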

Still, this is how we store session data at Avvo, and it’s worked well for us so far.

Storing sessions in the database

If you want to keep your session data around until it legitimately expires, you probably want to keep it in some kind of database. Whether that’s Redis, ActiveRecord, or something else.

But database session storage also has drawbacks:

  • With some database stores, your sessions won’t get cleaned up automatically.

    So you’ll have to go through and clean expired sessions on your own.

  • You have to know how your database will behave when it’s full of session data.

    Are you using Redis as your session store? Will it try to keep all your session data in memory? Does your server have enough memory for that, or will it start swapping so badly you won’t be able to ssh in to fix it?

  • You have to be more careful about when you create session data, or you’ll fill your database with useless sessions.

    For example, if you accidentally touch the session on every request, googlebot could create hundreds of thousands of useless sessions. And that would be a bad time.

Most of these problems are pretty rare. But you should still be aware of them.

So how should you store your sessions?

If you’re pretty sure you won’t run into any of the cookie store’s limitations, use it. It doesn’t need much setup, and isn’t a headache to maintain.

Storing sessions in the cache vs. the database is more a judgement call about how bad it would be to expire a session early. I treat session data as pretty temporary, so the cache store works well for me. So I usually try cookie first, then cache, then database.

But how about you? How do you store your sessions? Leave a comment and let me know!

ActiveRecord callbacks are an easy way to run code during the different stages of your model’s life.

For example, say you have a Q&A site, and you want to be able to search through all the questions. Every time you make a change to a question, you’ll want to index it in something like ElasticSearch. Indexing takes a while and isn’t urgent, so you’ll do it in the background with Sidekiq.

This seems like the perfect time to use an after_save callback! So in your model, you’ll write something like:

app/models/question.rb
class Question < ActiveRecord::Base
  after_save :index_for_search

  # ...

  private

  def index_for_search
    QuestionIndexerJob.perform_later(self)
  end
end

app/jobs/question_indexer_job.rb
class QuestionIndexerJob < ActiveJob::Base
  queue_as :default

  def perform(question)
    # ... index the question ...
  end
end

This works great! Or, at least, it seems to. Until you queue a lot more jobs and see these errors show up:

2015-03-10T05:29:02.881Z 52530 TID-oupf889w4 WARN: Error while trying to deserialize arguments: Couldn't find Question with 'id'=3

Sure, Sidekiq will retry the job and it’ll probably work next time. But it’s still a little weird. Why can’t Sidekiq find the question you just saved?

A race condition between processes

Rails calls after_save callbacks immediately after the record saves. But that record can’t be seen by other database connections, like the one Sidekiq is using, until the database transaction is committed, which happens a little later. This means there’s a chance that Sidekiq will try to find your question after you save it, but before you commit it. It can’t find your record, and it explodes.
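If the timing is hard to picture, here’s a toy model of it in plain Ruby. ToyDatabase is hypothetical and much simpler than a real database, but it follows the same rule: rows written inside a transaction are invisible to everyone else until the transaction commits.

```ruby
# A toy database where other connections only ever see committed rows.
class ToyDatabase
  def initialize
    @committed = {}
  end

  # What any *other* connection (like a background worker) can see.
  def find(id)
    @committed[id]
  end

  # Rows written inside the block stay private until the block finishes.
  def transaction
    pending = {}
    yield pending
    @committed.merge!(pending)
  end
end

db = ToyDatabase.new
seen_during_after_save = :not_checked

db.transaction do |pending|
  pending[3] = "Is it legal to kill a zombie?"
  # This is the moment after_save runs. A worker that picked up the job
  # right now would look for question 3 and find nothing:
  seen_during_after_save = db.find(3)
end

seen_during_after_save # => nil
db.find(3)             # => "Is it legal to kill a zombie?"
```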

This problem is so common that Sidekiq has an FAQ entry about it. And there’s an easy fix.

Instead of after_save:

app/models/question.rb
class Question < ActiveRecord::Base
  after_save :index_for_search

  # ...
end

use after_commit:

app/models/question.rb
class Question < ActiveRecord::Base
  after_commit :index_for_search

  # ...
end

And your job won’t get queued until Sidekiq can see your model.

So, when you queue a background job or tell another process about a change you just made, use after_commit. If you don’t, they might not be able to find the record you just touched.

But there’s one more problem…

OK, you switched a bunch of your after_save hooks to use after_commit instead. Everything seems to work. Time to check it all in and go home, right?

First, you’ll want to run your tests:

test/models/question_test.rb
require 'test_helper'

class QuestionTest < ActiveSupport::TestCase
  test "A saved question is queued for indexing" do
    assert_enqueued_with(job: QuestionIndexerJob) do
      Question.create(title: "Is it legal to kill a zombie?")
    end
  end
end

  1) Failure:
QuestionTest#test_A_saved_question_is_queued_for_indexing [/Users/jweiss/Source/testapps/after_commit/test/models/question_test.rb:7]:
No enqueued job found with {:job=>QuestionIndexerJob}

Whoops! Shouldn’t the test have queued the job? What just happened there?

By default, Rails wraps each test case in its own database transaction. This can really speed things up. It takes just one database command to undo all the changes you made during the test.

But this also means your after_commit callback won’t run. Because after_commit callbacks only run when the outermost transaction has been committed.

When you call save inside a test case, it still commits a transaction (more or less), but that’s the second-most-outermost transaction now. So your after_commit callbacks won’t run when you expect them to. And you can’t test what happens inside them.
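If the nesting is hard to picture, here’s a rough model of it in plain Ruby. FakeConnection is a made-up class, not anything from Rails. It only counts how deeply transactions are nested and holds commit hooks until the outermost one finishes:

```ruby
# Toy model of nested transactions. FakeConnection is hypothetical; it tracks
# nesting depth and defers commit hooks until the depth is back to zero.
class FakeConnection
  attr_reader :log

  def initialize
    @depth = 0
    @callbacks = []
    @log = []
  end

  # rollback_at_end mimics a test wrapper that undoes everything when it ends.
  def transaction(rollback_at_end: false)
    @depth += 1
    yield
    @depth -= 1
    return unless @depth.zero?

    callbacks, @callbacks = @callbacks, []
    callbacks.each(&:call) unless rollback_at_end
  end

  def after_commit(&block)
    @callbacks << block
  end
end

# Outside of a test: save opens the outermost transaction, so the hook runs.
app_conn = FakeConnection.new
app_conn.transaction do
  app_conn.after_commit { app_conn.log << "indexed" }
end

# Inside a test: the wrapper owns the outermost transaction and rolls it back,
# so the hook queued by the inner transaction never runs.
test_conn = FakeConnection.new
test_conn.transaction(rollback_at_end: true) do
  test_conn.transaction do
    test_conn.after_commit { test_conn.log << "indexed" }
  end
  test_conn.log << "assertions run here"
end

p app_conn.log
p test_conn.log
```

In the second case the log never gets an "indexed" entry, which is the same thing the failing test above is telling you.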

This problem also has an easy fix. Include the test_after_commit gem in your Gemfile:

Gemfile
group :test do
  gem "test_after_commit"
end

And your after_commit hooks will run after the second-outermost transaction commits. Which is what you were expecting to happen.

You might be thinking, “That’s weird. Why do I have to use a whole separate gem to test a callback that comes with Rails? Shouldn’t it just happen automatically?”

You’re right. It is weird. But it won’t stay weird for long.

Once Rails 5 ships, you won’t have to worry about test_after_commit. Because this problem was fixed in Rails about a month ago.


In my own code, I use after_commit a lot. I probably use it more than I use after_save! But it hasn’t come without its problems and strange edge cases.

Version by version, though, it’s getting better. And when you use after_commit in the right places, a lot of weird, random exceptions just won’t happen anymore.

When your data model gets complicated, and your APIs hit that sad 1 second response time, there’s usually an easy fix: :includes. When you preload your model’s associations, you won’t make as many SQL calls. And that can save you a ton of time.

But then your site slows down again, and you think about caching responses. And now you have a problem. Because if you want to get responses from the cache:

results = {lawyer_1: 1, lawyer_2: 2, lawyer_3: 3}
# fetch_multi takes the keys as separate arguments, so splat the array
cached_objects = Rails.cache.fetch_multi(*results.keys) do |key|
  Lawyer.find(results[key]).as_json
end

You’ve now lost all your :includes. Can you have both? How do you get a fast response for your cached objects, and still quickly load the objects that aren’t in the cache?

There’s a lot to do, so thinking about it is tough. It’s easier when you break the problem apart into smaller pieces, and come up with a simple next step.

So what’s the first thing you can do? To do much of anything, you need to know which objects are in your cache, and which ones you still need to find.

Separate the cached from the uncached

So, say you have a bunch of cache keys:

cache_keys = [:key_1, :key_2, :key_3]

How can you tell which of these are in the cache?

ActiveSupport::Cache has a handy method called read_multi:

# When only lawyer_1 is cached

cache_keys = [:lawyer_1, :lawyer_2, :lawyer_3]
# read_multi takes the keys as separate arguments, so splat the array
Rails.cache.read_multi(*cache_keys) # => {:lawyer_1 => {"id" => 1, "name" => "Bob the Lawyer"}}

read_multi returns a hash of {key: value} for each key found in the cache. But how do you find all the keys that aren’t in the cache? You can do it the straightforward way: Loop through all the cache keys and find out which ones aren’t in the hash that read_multi returns:

cache_keys = [:lawyer_1, :lawyer_2, :lawyer_3]
uncached_keys = []

cached_keys_with_values = Rails.cache.read_multi(*cache_keys)

cache_keys.each do |key|
  uncached_keys << key unless cached_keys_with_values.has_key?(key)
end
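
If the loop feels heavy, the same list can be built in one line. This is just a sketch: a plain Hash stands in for whatever read_multi returned, so it runs without Rails.

```ruby
# A plain Hash stands in for the result of read_multi (assumption for the demo).
cache_keys = [:lawyer_1, :lawyer_2, :lawyer_3]
cached_keys_with_values = { lawyer_1: { "id" => 1, "name" => "Bob the Lawyer" } }

# Array difference keeps the original order of cache_keys.
uncached_keys = cache_keys - cached_keys_with_values.keys

# reject reads closer to the loop above and skips building a second array of keys.
uncached_keys_via_reject = cache_keys.reject { |key| cached_keys_with_values.key?(key) }

p uncached_keys
p uncached_keys_via_reject
```

Both versions keep the keys in their original order, which matters later when you put the final list together.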

So, what do you have now?

  • An array of all the cache keys you wanted objects for.
  • A hash of {key: value} pairs for each object you found in the cache.
  • A list of the keys that weren’t in the cache.

And what do you need next?

  • The values for the keys that weren’t in the cache. Preferably fetched all at once.

That’s your next step.

Preload the uncached values

Soon, you’ll have to find an object using a cache key. To make things easier, you can change the code to something like:

cache_identifiers = {lawyer_1: 1, lawyer_2: 2, lawyer_3: 3}
cache_keys = cache_identifiers.keys
uncached_keys = []

cached_keys_with_values = Rails.cache.read_multi(*cache_keys)

cache_keys.each do |key|
  uncached_keys << key unless cached_keys_with_values.has_key?(key)
end

So cache_identifiers now keeps track of the cache key and the object id to fetch.

Now, with your uncached keys:

uncached_keys # => [:lawyer_2, :lawyer_3]

And your cache_identifiers hash:

cache_identifiers # => {lawyer_1: 1, lawyer_2: 2, lawyer_3: 3}

You can fetch, preload, and serialize all those objects at once:

uncached_ids = uncached_keys.map { |key| cache_identifiers[key] }
uncached_lawyers = Lawyer.where(id: uncached_ids)
                         .includes([:address, :practice_areas, :awards, ...])
                         .map(&:as_json)

So what do you have now?

  • An array of all the cache keys you wanted objects for to begin with.
  • A hash of {key: value} pairs for each object found in the cache.
  • A list of the keys that weren’t in the cache.
  • All the values that weren’t found in the cache.

And what do you need next?

  • To cache all the values you just fetched, so you don’t have to go through this whole process next time.
  • The final list of all your objects, whether they came from the cache or not.

Cache the uncached values

You have two lists: one list of uncached keys and another of uncached values. But to cache them, it’d be easier if you had one list of [key, value] pairs, so that your value is right next to its key. This is an excuse to use one of my favorite methods, zip:

[1, 2, 3].zip(["a", "b", "c"]) # => [[1, "a"], [2, "b"], [3, "c"]]

With zip, you can cache your fetched values easily:

uncached_keys.zip(uncached_lawyers).each do |key, value|
  Rails.cache.write(key, value)
end

What do you have now?

  • An array of all the cache keys you wanted objects for to begin with.
  • A hash of {key: value} pairs for each object found in the cache.
  • A list of formerly-uncached values that you just cached.

And what do you still need?

  • One big list of all your objects, whether they came from the cache or not.

Bring it all together

Now, you have an ordered list of cache keys:

cache_keys = cache_identifiers.keys

Your list of the objects you fetched from the cache:

cached_keys_with_values = Rails.cache.read_multi(*cache_keys)

And your list of objects you just now grabbed from the database:

uncached_ids = uncached_keys.map { |key| cache_identifiers[key] }
uncached_lawyers = Lawyer.where(id: uncached_ids)
                         .includes([:address, :practice_areas, :awards, ...])
                         .map(&:as_json)

Now you just need one last loop to put everything together:

results = []
cache_keys.each do |key|
  # Parentheses matter: `<<` binds more tightly than `||`
  results << (cached_keys_with_values[key] || uncached_lawyers.shift)
end

That is, for each cache key, you grab the object you found in the cache for that key. If that key wasn’t originally in the cache, you grab the next object you pulled from the database.

After that, you’re done!

Here’s what the whole thing looks like:

cache_identifiers = {lawyer_1: 1, lawyer_2: 2, lawyer_3: 3}
cache_keys = cache_identifiers.keys
uncached_keys = []

# Fetch the cached values from the cache
cached_keys_with_values = Rails.cache.read_multi(*cache_keys)

# Create the list of keys that weren't in the cache
cache_keys.each do |key|
  uncached_keys << key unless cached_keys_with_values.has_key?(key)
end

# Fetch all the uncached values, in bulk
uncached_ids = uncached_keys.map { |key| cache_identifiers[key] }
uncached_lawyers = Lawyer.where(id: uncached_ids)
                         .includes([:address, :practice_areas, :awards, ...])
                         .map(&:as_json)

# Write the uncached values back to the cache
uncached_keys.zip(uncached_lawyers).each do |key, value|
  Rails.cache.write(key, value)
end

# Create our final result set from the cached and uncached values
results = []
cache_keys.each do |key|
  # Parentheses matter: `<<` binds more tightly than `||`
  results << (cached_keys_with_values[key] || uncached_lawyers.shift)
end
results
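
The listing above needs Rails.cache and a Lawyer model, so you can’t run it on its own. Here’s a rough, self-contained sketch of the same flow. FakeCache, LAWYERS, find_lawyers, and fetch_all are all stand-ins invented for this example, and the extra names are made-up sample data. One thing the listing quietly assumes is that the database returns rows in the same order as uncached_ids. Without an ORDER BY that isn’t guaranteed, so this sketch indexes the fetched rows by id instead of relying on zip and shift:

```ruby
# Stand-ins so the sketch runs without Rails. In the app these would be
# Rails.cache and the Lawyer model.
class FakeCache
  def initialize(store = {})
    @store = store
  end

  # Like read_multi, returns only the keys that are present.
  def read_multi(*keys)
    @store.select { |key, _| keys.include?(key) }
  end

  def write(key, value)
    @store[key] = value
  end
end

LAWYERS = {
  1 => { "id" => 1, "name" => "Bob the Lawyer" },
  2 => { "id" => 2, "name" => "Alice the Lawyer" },
  3 => { "id" => 3, "name" => "Carol the Lawyer" }
}.freeze

# Pretend bulk query. It returns rows in reverse order on purpose, because a
# database is free to return rows in any order without an ORDER BY.
def find_lawyers(ids)
  LAWYERS.values_at(*ids).reverse
end

def fetch_all(cache, cache_identifiers)
  cache_keys = cache_identifiers.keys
  cached = cache.read_multi(*cache_keys)
  uncached_keys = cache_keys - cached.keys

  # One bulk fetch, then index by id so the row order no longer matters.
  uncached_ids = uncached_keys.map { |key| cache_identifiers[key] }
  fetched_by_id = find_lawyers(uncached_ids).to_h { |lawyer| [lawyer["id"], lawyer] }

  # Walk the keys in their original order, filling gaps from the bulk fetch
  # and writing each newly fetched value back to the cache.
  cache_keys.map do |key|
    cached.fetch(key) do
      value = fetched_by_id.fetch(cache_identifiers[key])
      cache.write(key, value)
      value
    end
  end
end

cache = FakeCache.new({ lawyer_1: LAWYERS[1] })
results = fetch_all(cache, { lawyer_2: 2, lawyer_1: 1, lawyer_3: 3 })
p results.map { |lawyer| lawyer["id"] }
```

The results come back in the order of the keys you asked for, and the two lawyers that were missing are in the cache afterward.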

Was it worth it? Maybe. It’s a lot of code. But if you’re caching objects with lots of associations, it could save you dozens or hundreds of SQL calls. And that can shave a ton of time off of your API responses.

At Avvo, this pattern has been incredibly useful: a lot of our JSON APIs use it to return cached responses incredibly quickly.

The pattern has been so useful that I wrote a gem to encapsulate it called bulk_cache_fetcher. So if you ever find yourself trying to cache big, complicated data models, give it a try!

You’re ready to launch your first production app, and it’s time to get it talking to some external services. You still have to get everything hooked up. So what’s the best way to configure your services in production, without making things more complicated on your dev machine?

Set up your Environment

To configure production apps, today’s best practice is to use environment variables (those ENV["REDIS_HOST"]-looking things).

But why?

  • It’s harder to accidentally commit your production keys.

    If you’re not paying attention, you might git push a file with important secret keys in it. And that could be an expensive mistake.

  • Configuration is what environment variables are there for.

    Environment variables are a common way to configure apps on almost every kind of system. Many other programs (like Ruby) use environment variables for configuration, so it only makes sense to try them in your own app.

  • Environment variables are easy to set up in production.

    Heroku has a web UI and a command line tool for easily setting environment variables. And if you’re building your own server, server management tools like Chef and Docker make setting environment variables easy.

What does it look like on the Rails side?

This is how an app that depends on environment variables could configure itself:

config/my_service.yml
production:
  host: <%= ENV["MY_SERVICE_HOST"] %>
  port: <%= ENV["MY_SERVICE_PORT"] %>

config/initializers/my_service.rb
my_service_config = Rails.application.config_for(:my_service)

my_service = MyService.new(my_service_config["host"], my_service_config["port"])

The initializer uses Rails 4.2’s config_for method to find the right .yml file and pick the right environment.

Then, config_for runs the ERB code inside my_service.yml, and grabs MY_SERVICE_HOST and MY_SERVICE_PORT out of the environment. It passes those values along to MyService.
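
If you’re curious what those steps look like, here’s a rough approximation using only Ruby’s standard library. This is not Rails’ actual implementation. load_config is a made-up helper, and the host and port values are samples set just so the sketch can run:

```ruby
require "erb"
require "yaml"

# Rough approximation of the steps described above; not Rails' actual code.
def load_config(yaml_source, env_name)
  rendered = ERB.new(yaml_source).result # run the ERB tags first
  all_envs = YAML.safe_load(rendered)    # then parse the resulting YAML
  all_envs.fetch(env_name)               # then pick one environment's section
end

yaml_source = <<~YAML
  production:
    host: <%= ENV["MY_SERVICE_HOST"] %>
    port: <%= ENV["MY_SERVICE_PORT"] %>
YAML

# Sample values, set here only so the sketch can run on its own.
ENV["MY_SERVICE_HOST"] = "search.example.com"
ENV["MY_SERVICE_PORT"] = "9200"

config = load_config(yaml_source, "production")
p config
```

One detail worth noticing: environment variables are always strings, but once the port passes through YAML it comes back as an integer.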

You could also just have the initializer read from ENV["MY_SERVICE_HOST"] directly. But I prefer to keep them in .yml files, for reasons you’ll see in a minute.

Your app’s configuration in development

Environment variables are fine for production. But once you set up your production config, how do you handle development and test mode?

You have a few options. But I usually follow the convention in Rails’ config/secrets.yml: use environment variables in production, and hardcode non-secret values in development and test.

With the development and test environments, config/my_service.yml could look like this:

config/my_service.yml
production:
  host: <%= ENV["MY_SERVICE_HOST"] %>
  port: <%= ENV["MY_SERVICE_PORT"] %>
  
development:
  host: localhost
  port: 8081
  
test:
  host: localhost
  port: 8081

Awesomely enough, the initializer can stay exactly the same. The values in this file will be used in the development and test environments, and the production environment will get its values from the environment variables.

But why would you hardcode these values?

  • The configuration values are easier to see and change.

    You can tweak your config as you experiment with new features, which is something you want in development, but not so much in production.

  • It’s easier for someone new to get started.

    If all the sample config you need is checked into your git repository, a new dev just has to clone and run your app. They won’t need to muck around with setting just the right values to get the app working.

  • You don’t have to worry about conflicting environment variables.

    You’ll probably work on more apps on your dev machine than you’ll ever deploy to a single production machine. If you used system-wide environment variables to configure all those apps, there’s a good chance two of them will stomp on each other.

So, try using environment variables in production, and hardcoded .yml config in development. It’s easy, it’s readable, and Rails has built-in support for dealing with exactly those kinds of config files.

Another option for development

There’s another way to handle configuration in development mode: dotenv. It looks neat, but I haven’t tried it in an app of my own yet.

With dotenv, you can put environment variables in a file named .env in your Rails app’s root directory, and those values will get picked up by your app. This is nice, because your development environment acts more like your production environment. That’s a good way to avoid bugs that only ever happen in production.
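
To show the general idea (not dotenv’s code or its API), here’s a plain-Ruby sketch of a .env-style loader. load_env_file is a hypothetical helper, and the choice to leave already-set variables alone is an assumption of this sketch:

```ruby
require "tempfile"

# Hypothetical helper: copies KEY=VALUE lines from a file into ENV.
# In this sketch, values already set in the real environment win.
def load_env_file(path)
  File.readlines(path, chomp: true).each do |line|
    next if line.strip.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    ENV[key] ||= value
  end
end

Tempfile.create(".env") do |file|
  file.write("# sample development settings\nMY_SERVICE_HOST=localhost\nMY_SERVICE_PORT=8081\n")
  file.flush

  ENV.delete("MY_SERVICE_HOST")
  ENV["MY_SERVICE_PORT"] = "9999" # pretend this was already set in the shell

  load_env_file(file.path)
end

puts ENV["MY_SERVICE_HOST"]
puts ENV["MY_SERVICE_PORT"]
```

The host comes from the file, while the port keeps the value that was already in the environment.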

It’s something I’ll try someday. But for now, I haven’t found anything more convenient than .yml and config_for.


Most production apps need some kind of configuration. So when you deploy your next app, try using .yml files, populated by environment variables in production, to configure it. You’ll get the flexibility, the simplicity, and the reliability you’re hoping for.

Do you have a different way you like to configure your production apps? Leave a comment, I’d love to hear about it!

Is your GitHub contribution chart solid gray? You could use an open source project to work on. But you don’t have to start it from scratch. The easiest way to create a useful side project is to pull it out of the app you’re already building. That’s how Rails was born!

But how do you know what to extract? And how do you turn it into a gem, without destroying your workflow?

Find the code you want to extract.

Somewhere, deep inside your app, is some code that doesn’t belong there. Code that doesn’t need your app to do its job. Where is it?

Sometimes, you’ll just have to guess. But I often find extractable code in the same few places:

  • Validations

    Have you written any custom validations for your attributes? Those can make great gems.

  • Changes you’ve made to Rails

    Every once in a while, you’ll have to mess with Rails to get commit hooks working in tests or blank attributes to turn into NULL in the database. Whenever I’ve moved this kind of logic into a gem, it’s become more stable and easier to understand.

  • Non-ActiveRecord models

    Do you do so much email address or phone number parsing that you moved it into its own class? These classes are often useful in other apps, and they’re pretty easy to turn into gems.

  • Mock objects and custom assertions

    You can write more readable tests by using custom assertions. And once you have some good custom assertions written for a library or pattern, they’re helpful to anyone else who uses that library or pattern.

You don’t have to think big. Some of my favorite gems are just one file!

And if you still can’t decide what to extract, browse the category list on RubyToolbox for some inspiration.

Put your code in a gem-like directory

Once you know which code you’ll turn into a gem, move that code around so it fits a gem-like structure inside your app.

In Rails apps, I use lib/ as a gem staging area. lib/ is where I put code that has the potential to turn into its own gem. So, if you’re creating a gem called “json_api_client”, your Rails app’s lib/ directory might look like:

...
my_rails_app/lib/json_api_client.rb
my_rails_app/lib/json_api_client/associations.rb
my_rails_app/lib/json_api_client/connection.rb
...

Most gems will have a file under lib/ named after the gem (lib/json_api_client.rb), and a bunch of files in a directory named after that gem (everything under lib/json_api_client/). So, take that same structure and match it inside your Rails app. That will make it much easier to move code into the gem later on.
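
Here’s a sketch of how those files usually relate to each other: the file named after the gem requires the files in the directory, and everything lives under one module. To keep it runnable, the script builds a throwaway copy of the layout in a temp directory. The module and class bodies are empty placeholders, not code from any real gem:

```ruby
require "tmpdir"
require "fileutils"

# Builds a throwaway copy of the layout above so the requires can run.
# The constants defined inside the files are made up for illustration.
Dir.mktmpdir do |root|
  lib = File.join(root, "lib")
  FileUtils.mkdir_p(File.join(lib, "json_api_client"))

  # lib/json_api_client.rb: the entry point, named after the gem.
  File.write(File.join(lib, "json_api_client.rb"), <<~RUBY)
    require "json_api_client/associations"
    require "json_api_client/connection"

    module JsonApiClient
    end
  RUBY

  # lib/json_api_client/*.rb: everything else, namespaced under the gem's module.
  File.write(File.join(lib, "json_api_client", "associations.rb"), <<~RUBY)
    module JsonApiClient
      module Associations
      end
    end
  RUBY

  File.write(File.join(lib, "json_api_client", "connection.rb"), <<~RUBY)
    module JsonApiClient
      class Connection
      end
    end
  RUBY

  # An installed gem gets its lib/ directory on the load path for you;
  # here it's added by hand.
  $LOAD_PATH.unshift(lib)
  require "json_api_client"
end

p JsonApiClient::Connection
```

Once your code follows this shape inside your app’s lib/, moving it into a gem is mostly a matter of copying the directory.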

If you’re confused about what a gem layout looks like, take a look at some of your favorite gems’ source on GitHub. You’ll pick up the pattern pretty quickly.

What about tests?

I used to follow lib/’s structure inside test/unit/:

...
my_rails_app/test/unit/json_api_client_test.rb
my_rails_app/test/unit/json_api_client/associations_test.rb
my_rails_app/test/unit/json_api_client/connection_test.rb
...

It worked OK, even if putting models and libraries in the same folder got a little messy.

Now, though, Rails uses test/models/ instead of test/unit/. And storing your lib/ tests inside test/models/ doesn’t make a whole lot of sense. I haven’t really decided on a convention for this yet. Do you have any suggestions?

Break the dependencies

Once your code is inside a gem, it won’t be able to depend on your app. This means you’ll have to go through the code you put in lib/, and look for places where it depends on classes, objects, or behavior specific to your app.

If you do find any of these dependencies, you’ll have to break them. There’s a lot of great writing about how to break (or inject) dependencies, so I’m not really going to go into that now.
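
Still, one small example might help you spot the pattern. This is only a sketch, and LawyerFetcher and HashCache are made-up names. The idea is to pass the app-specific object in, instead of reaching for it from inside the class:

```ruby
# Before: the class reaches into the app for its cache, so it can't leave the app.
#
#   class LawyerFetcher
#     def fetch(key)
#       Rails.cache.read(key)
#     end
#   end

# After: the cache is passed in. Anything that responds to `read` will do.
class LawyerFetcher
  def initialize(cache)
    @cache = cache
  end

  def fetch(key)
    @cache.read(key)
  end
end

# In the app you would pass Rails.cache. In the gem's tests, a tiny stand-in works.
class HashCache
  def initialize(data)
    @data = data
  end

  def read(key)
    @data[key]
  end
end

fetcher = LawyerFetcher.new(HashCache.new({ lawyer_1: "Bob the Lawyer" }))
puts fetcher.fetch(:lawyer_1)
```

The class no longer mentions Rails at all, so it can move into a gem and be tested without booting your app.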

Create a gem

I use bundle gem to create my gems. Specifically, to create a gem called bulk_cache_fetcher:

bundle gem bulk_cache_fetcher -t minitest

The -t adds some test helper files and the test tasks to the gem’s Rakefile.

You’ll have to do some housekeeping next, like filling out the .gemspec, writing the README, picking a LICENSE, all that stuff.
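
For a sense of what filling out the .gemspec involves, here’s a minimal sketch. Every value in it is a placeholder, and the file bundle gem generates for you will have more fields than this. A real .gemspec wouldn’t assign the result to a variable or print anything; that part is only here so the sketch does something when you run it:

```ruby
# bulk_cache_fetcher.gemspec (sketch; every value below is a placeholder)
spec = Gem::Specification.new do |s|
  s.name          = "bulk_cache_fetcher"
  s.version       = "0.0.1"
  s.authors       = ["Your Name"]
  s.summary       = "Placeholder one-line summary of what the gem does"
  s.license       = "MIT"
  s.files         = Dir["lib/**/*.rb"]
  s.require_paths = ["lib"]
end

puts spec.full_name
```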

And then, since you already have your gem’s code in your Rails app’s lib/ folder, you can just move that code and its tests into the lib/ and test/ folders in your new gem.

A lot of the time, there’ll be things you missed or forgot about, or code that assumes things about your app that aren’t true inside a gem. So, before you move on, run the tests you moved into your gem, and make sure they all pass.

Use your new gem in your app

Now that you have a gem, you want to use it, right? But testing changes to your gem inside your Rails app can get annoying, quickly. You have to:

  1. Make the change in your gem
  2. Build the gem
  3. Remove all traces of the gem from your system, or update the version
  4. Install the gem
  5. Restart your server

This is pretty awful. Luckily, bundler gives you an easier way. Say you’ve created your gem in ~/Source/bulk_cache_fetcher. While you’re testing gem changes, you can write this inside your Rails app’s Gemfile:

Gemfile
gem "bulk_cache_fetcher", path: "~/Source/bulk_cache_fetcher"

Next, run bundle install, and you’ll be able to make changes to your gem as if that code still lived in your app’s lib/ folder.

One last thing: make sure you remove the path: before you check in your code! That path may not point to the right place on other systems, so chances are it won’t work anywhere except your machine.

Build, ship, and enjoy!

Once your gem is ready, you can send it out to the world.

So, sign up for an account on RubyGems if you haven’t already, check in your changes, and run rake release. Congratulations, you’re now a gem author! And once you push that gem to GitHub, you’ll get your nice green square for the day.

Do you have any parts of your own apps that seem extractable? Anything that you think might make a good gem? I’d love to hear about it – just leave a comment below.

And if you want to learn a ton more about creating, managing, and maintaining Ruby gems, I highly recommend Brandon Hilkert’s Build a Ruby Gem. The process his book follows is very close to this one, and it covers a whole lot more.